Closed dinamic closed 11 years ago
thank you! can you have a look at the failing test on Travis? @dbu should have a look at you method rename
Sorry, didn't noticed the failing test.
While checking the failed tests I realized there's an issue with my current implementation. It has to use look-ahead to determine the new line sequence, rather than look-behind.
thanks for this PR. i commented on some naming things. can you solve the look-ahead issue you have or do you need input on that?
is the reason you ran into this issue in the first place that your git client converted the line endings? i fear that we still can see problems after this fix. what would happen if we always check for both types of line endings, and maybe add the apple OS endings as well? might be the most flexible approach.
btw, impressive that after 30 years of PC we still see this stupid problem :-(
The look-ahead is solved.
The issue I came upon was invoked because of the hardcoded EOL. It was set like "\n" and for Windows is "\r\n". Later on the process of parsing there was the assumption that the EOL is a single character, which obviously is not true, so the code failed to parse the string.
I'm struggling at the GenericScanner now. There are some naming wtfs that are pretty confusing.
Once I'm done with the GenericScanner, I believe we would be able to handle whatever EOL we like. LF, CR, CR+LF, LF+CR, you name it. IMHO, it is a design fail that caused the issue.
I had the idea of converting the EOL within the reader constructor. Any thoughts on that?
maybe we could simply string replace all possible formats to \n
and then do the char comparison on \n
rather than the php constant. the LF char is not valid anywhere else in the cnd document anyway than as end of line character... this could make the code as simple as it was before.
we load the whole string anyways, and cnd files are not expected to be huge, so doing the replace upfront sounds save to me from a memory perspective.
the constructor then would be
$buffer = str_replace("\r\n", "\n", $buffer);
$buffer = str_replace("\r", "\n", $buffer);
Exactly my point. I was not aware whether the CRLF would be used anywhere inside the document and couldn't decide by myself to do such conversion.
i don't expect any problems, lets convert everything to the one-char \n
in the constructor of BufferReader, and simplify the code again (we won't need peek and multi-char EOL anymore then)
@dinamic can you do the cleanup and convert all cnd files to unix line endings in the constructor, before further handling them?
@dbu In Windows (at least Windows 7) in PHP global constant PHP_EOL by default equal CRLF, not LF. Accordingly, any comparison between currentChar() and PHP_EOL not work, because PHP_EOL contains 2 chars. In this case, it does not matter what kind of line breaks in the cnd files.
I see the most reasonable solution to convert all line breaks in cnd to the LF and replace PHP_EOL in all comparisons also to LF. P.S. Sorry, I casually read discussion, it's an already known.
yes, thats exactly what we should do. i hope dinamic can do it - if you don't want to wait you could fork his repo and finish it and create a new PR
superseded by #51
Thanks for contributing @relo-san. I didn't had the time to finish my fix.
CND data was not correctly parsed while using Windows OS. The issue was caused by the different line endings between the operating systems.
I came up with a fix and updated the unit tests to test both Windows and Unix line endings.
I've found some discrepancies inside the FileReader class, so it is updated as well.
This commit fixes: https://github.com/symfony-cmf/symfony-cmf-standard/issues/21