Open max-heller opened 2 years ago
Thanks for the report.
Basically this happens here.
I haven't looked at this code for a very long time and only had a quick look, but I'm not sure there's an easy fix.
Basically since `` `ASCII `` normalizes U+000D (CR), U+000A (LF) and the sequence <U+000D, U+000A> (CRLF), the easiest is to always report the normalisation on either U+000D or U+000A and then suppress the output on U+000A if the last character was a CR.
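To illustrate the approach described above, here is a minimal sketch that works on raw bytes rather than on Uutf's decoder. The function `normalize_newlines` is a hypothetical helper written for this example, not code from Uutf. It emits the normalized newline as soon as it sees a CR or an LF, and drops an LF that immediately follows a CR.

```ocaml
(* Hypothetical helper for illustration only, not Uutf's implementation.
   Maps CR, LF and CRLF to a single LF without any lookahead. *)
let normalize_newlines (s : string) : string =
  let b = Buffer.create (String.length s) in
  (* Remembers whether the previously read byte was a CR. *)
  let last_was_cr = ref false in
  String.iter
    (fun c ->
      (match c with
       | '\r' ->
           (* Report the newline immediately on CR. *)
           Buffer.add_char b '\n'
       | '\n' ->
           (* Suppress the LF of a CRLF pair, report a lone LF. *)
           if not !last_was_cr then Buffer.add_char b '\n'
       | c -> Buffer.add_char b c);
      last_was_cr := (c = '\r'))
    s;
  Buffer.contents b
```

Note that in this sketch the newline for a CRLF pair is emitted once the CR has been read, before the LF is consumed. Reporting it only after the whole pair has been seen is what would need a lookahead.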
Changing this would require introducing a lookahead in the implementation, and introducing one gets you into the non-blocking `` `Await `` handling business. Not saying it can't be done, it's just not obvious in the 10 minutes I spent on this, and I'm not sure it's worth fixing now that we have UTF decoding in the stdlib since 4.14.
Just to make sure, are you here because of this code? Maybe I'd rather help you get rid of Uutf in favor of the new UTF decoding in the 4.14 stdlib than fix this bug :-) In particular you don't seem to use the non-blocking stuff, so that should be easy.
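As a rough idea of what such a migration could look like, here is a minimal sketch of a blocking UTF-8 fold built on the stdlib decoding functions available since OCaml 4.14 (`String.get_utf_8_uchar`, `Uchar.utf_decode_uchar`, `Uchar.utf_decode_length`). The name `fold_utf_8` and its shape are assumptions made for this example. It does no newline normalization and assumes the whole input is already in memory as a string.

```ocaml
(* Sketch assuming OCaml >= 4.14. [fold_utf_8] is a hypothetical name.
   Folds [f] over the Unicode characters of the UTF-8 encoded string [s]. *)
let fold_utf_8 (f : 'a -> Uchar.t -> 'a) (acc : 'a) (s : string) : 'a =
  let len = String.length s in
  let rec loop acc i =
    if i >= len then acc
    else
      let d = String.get_utf_8_uchar s i in
      (* On invalid input, the decoded character is Uchar.rep (U+FFFD). *)
      let u = Uchar.utf_decode_uchar d in
      (* The decode length is at least 1, so the loop always advances. *)
      loop (f acc u) (i + Uchar.utf_decode_length d)
  in
  loop acc 0

(* Usage: collect the code points of a string as integers. *)
let code_points (s : string) : int list =
  List.rev (fold_utf_8 (fun acc u -> Uchar.to_int u :: acc) [] s)
```

If invalid bytes need to be reported rather than replaced, `Uchar.utf_decode_is_valid` can be checked on each decode before using the character.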
The docs say that
but this contract does not hold for normalized CRLFs: