Closed Hywan closed 8 years ago
Slightly off topic but: https://github.com/tagua-vm/parser/blob/master/source/rules/literals.rs#L217 => This can also be 'B'. It also looks like b/B handling is missing for nowdocs (which also support it).
And both things are not in the spec :(
Damn… Could you open issues on this repository please?
And good catch!
I've opened #43
Another inaccuracy in the spec is that <<<'A'\nA\n (with only a single newline between 'A' and A) is a valid nowdoc string (the empty string). Is this properly supported?
Spec fix: https://github.com/php/php-langspec/commit/8e8df911b732cc08ca0ae45f788c918147b9d007
The nowdoc implementation currently doesn't handle CRLF newlines.
#[test]
fn case_string_nowdoc_empty() {
    let input  = b"<<<'FOO'\n\nFOO\n";
    let output = Result::Done(&b""[..], Literal::String(Vec::new()));
    assert_eq!(string_nowdoc(input), output);
    assert_eq!(string(input), output);
    assert_eq!(literal(input), output);
}
So yes, it is supported.
But you raised another issue, where we can have \r\n instead of only \n? Where exactly can they occur? Can we have <<<'FOO'\r\nFOO\n, which would be valid (note there are 2 forms)?
@Hywan The test only tests b"<<<'FOO'\n\nFOO\n", but not b"<<<'FOO'\nFOO\n" (which represents the same string).
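To make the two forms concrete, here is a minimal, self-contained sketch. `nowdoc_body` is a hypothetical helper and not the tagua-vm parser (which is built from parser combinators); it assumes `\n` newlines only and a closing identifier immediately followed by `\n`. It is only meant to show that both inputs should yield the empty string:

```rust
/// Hypothetical helper, not the tagua-vm implementation: returns the body of
/// a nowdoc literal, or `None` if the input is not a well-formed nowdoc.
/// Assumption: newlines are `\n` only and the closing identifier is
/// immediately followed by `\n`.
fn nowdoc_body(input: &[u8]) -> Option<&[u8]> {
    // Opening line: <<<'ID'\n
    let rest = input.strip_prefix(b"<<<'")?;
    let quote = rest.iter().position(|&b| b == b'\'')?;
    let (id, rest) = (&rest[..quote], &rest[quote + 1..]);
    if id.is_empty() {
        return None;
    }
    let rest = rest.strip_prefix(b"\n")?;

    // Closing line: ID\n
    let mut closing = id.to_vec();
    closing.push(b'\n');

    // Form 1: the closing identifier directly follows the opening line.
    if rest.starts_with(&closing) {
        return Some(&rest[..0]);
    }

    // Form 2: the body ends at the newline that precedes the closing line.
    let mut needle = vec![b'\n'];
    needle.extend_from_slice(&closing);
    let end = rest
        .windows(needle.len())
        .position(|w| w == needle.as_slice())?;
    Some(&rest[..end])
}
```

A test for the parser would then cover both `<<<'FOO'\n\nFOO\n` and `<<<'FOO'\nFOO\n` and expect an empty body in each case.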
Hmm. Is it too much to ask you to open issues with detailed examples please?
@nikic About CRLF, the grammar defines a new line as \n, \r or \r\n. I know of OSes that use \n (all Unix) and \r\n (most Windows), but none that use \r. Why this one?
After some searching, it appears that the following OSes use \r alone: Commodore 8-bit machines, Acorn BBC, ZX Spectrum, TRS-80, Apple II family, Oberon, Mac OS up to version 9, MIT Lisp Machine and OS-9. I am pretty sure we can drop them safely since none of them is a binary target for the compiler.
Also, there is \n\r on Acorn BBC and RISC OS spooled text output, or 0x9b (155 in decimal) on Atari 8-bit machines using the ATASCII variant of ASCII. These are very strange forms.
So, is there a good reason to support \r? It adds complexity to the parser, especially in error management.
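For reference, recognising the three newline forms the grammar allows could look roughly like the sketch below. `newline_len` is a hypothetical name, not a function from the parser; the only subtlety is that \r\n has to be tried before a lone \r:

```rust
/// Hypothetical helper: length in bytes of the newline at the start of
/// `input`, or `None` if `input` does not start with a newline.
fn newline_len(input: &[u8]) -> Option<usize> {
    match input {
        // `\r\n` must be tried before a lone `\r`, otherwise a CRLF
        // sequence would be counted as two separate newlines.
        [b'\r', b'\n', ..] => Some(2),
        [b'\n', ..] | [b'\r', ..] => Some(1),
        _ => None,
    }
}
```

Dropping the lone \r form would mean removing the second half of the `Some(1)` arm, at the cost of diverging from the grammar.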
@Hywan I guess this is just one of those things that made sense 20 years ago, but is pretty pointless now.
Thanks for your feedback 😄.
See example in https://github.com/php/php-langspec/blob/master/spec/09-lexical-structure.md#heredoc-string-literals:
Currently, spaces before ID are not supported.
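A rough sketch of what accepting those spaces could look like, assuming the spaces in question are the ones between <<< and a bare, unquoted identifier. `heredoc_opening_id` is a hypothetical helper, not part of the parser, and it only handles ASCII identifiers:

```rust
/// Hypothetical helper: extracts the identifier from a heredoc opening such
/// as `<<<   ID`, tolerating spaces and tabs before the identifier.
/// Assumption: the identifier is bare (unquoted) and ASCII only.
fn heredoc_opening_id(input: &[u8]) -> Option<&[u8]> {
    let rest = input.strip_prefix(b"<<<")?;

    // Skip horizontal whitespace between `<<<` and the identifier.
    let skip = rest
        .iter()
        .take_while(|&&b| b == b' ' || b == b'\t')
        .count();
    let rest = &rest[skip..];

    // Identifier: a letter or `_`, followed by letters, digits or `_`.
    let len = rest
        .iter()
        .take_while(|&&b| b.is_ascii_alphanumeric() || b == b'_')
        .count();
    if len == 0 || rest[0].is_ascii_digit() {
        return None;
    }
    Some(&rest[..len])
}
```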