Open sternenseemann opened 2 years ago
The problem comes from the test cases made by `go "utf8" TLE.encodeUtf8 TLE.decodeUtf8 CT.utf8` in the test suite, and the error is coming from `TLE.decodeUtf8`, not the `conduit-extra` code. So `conduit-extra`'s `CT.utf8` is succeeding while `TLE.decodeUtf8` is failing.
The `\xc2` is pretty consistent. This StackOverflow post. Calling `TLE.decodeUtf8` pretty consistently fails with `"\xc2"`.
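A minimal sketch of that failure outside the test suite, assuming `TLE` is `Data.Text.Lazy.Encoding` as the alias suggests. The names `loneLeadByte`, `tryDecode` and `decodeFails` are made up for illustration, and it uses `decodeUtf8'` rather than `decodeUtf8` so the error comes back as a value instead of an exception:

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE
import Data.Text.Encoding.Error (UnicodeException)

-- | The single byte 0xc2 on its own (hypothetical name, for illustration).
loneLeadByte :: BL.ByteString
loneLeadByte = BL.pack [0xc2]

-- | Decode lazily-chunked UTF-8, returning the error instead of throwing.
tryDecode :: BL.ByteString -> Either UnicodeException TL.Text
tryDecode = TLE.decodeUtf8'

-- | True when the input is rejected by the text package's decoder.
decodeFails :: BL.ByteString -> Bool
decodeFails = either (const True) (const False) . tryDecode
```

In GHCi, `decodeFails loneLeadByte` can then be compared against a complete sequence such as `decodeFails (BL.pack [0xc3, 0xa9])`.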
Building `conduit-extra` with `text-1.2.5.0`, I cannot trigger the bug. That makes me suspect the implementation of `Data.Streaming.Text.decodeUtf8`.
Seems to be related to this: https://github.com/fpco/streaming-commons/issues/70
The seemingly related issue https://github.com/fpco/streaming-commons/issues/70 was fixed in https://github.com/fpco/streaming-commons/pull/71. Does that mean that this issue is fixed as well?
The PR does refer to a flaky test, but I don't know if it's this one.
Could this be related to
? Because I see mentions of UTF8 above.
Yeah, `[194]` (aka `[0xc2]`) is supposed to fail to decode that way. We can also write this byte as `0b11000010`, which makes it more obvious that it has the form of a leading byte for a 2-byte sequence, but there is no trailing `0b10xxxxxx` byte, so it's impossible to decode correctly. It appears that `conduit-extra`'s UTF-8 decoder is NOT failing when it should.
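The bit-pattern argument can be sketched in a few lines of Haskell using only `base`. This is an illustrative structural check, not the decoder used by `text` or `conduit-extra`; the function names are hypothetical, and it deliberately ignores overlong encodings, surrogates and the upper code point limit:

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- | Length of the sequence announced by a first byte, judged only from its
-- high bits. Nothing means the byte cannot start a sequence.
expectedLength :: Word8 -> Maybe Int
expectedLength b
  | b .&. 0x80 == 0x00 = Just 1  -- 0xxxxxxx: single-byte (ASCII)
  | b .&. 0xE0 == 0xC0 = Just 2  -- 110xxxxx: leads a 2-byte sequence
  | b .&. 0xF0 == 0xE0 = Just 3  -- 1110xxxx: leads a 3-byte sequence
  | b .&. 0xF8 == 0xF0 = Just 4  -- 11110xxx: leads a 4-byte sequence
  | otherwise          = Nothing -- 10xxxxxx (trailing byte) or invalid

-- | True for trailing bytes of the form 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation b = b .&. 0xC0 == 0x80

-- | True when the bytes form exactly one structurally complete sequence:
-- a leading byte followed by the number of trailing bytes it announces.
isCompleteSequence :: [Word8] -> Bool
isCompleteSequence [] = False
isCompleteSequence (b : rest) =
  case expectedLength b of
    Nothing -> False
    Just n  -> length rest == n - 1 && all isContinuation rest
```

Here `expectedLength 0xc2` is `Just 2`, so `isCompleteSequence [0xc2]` is `False` because the announced trailing byte is missing, while `isCompleteSequence [0xc2, 0xa9]` is `True`.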
@SamB I am not sure whether you are implying that conduit-extra has UTF-8 decoding separate from the text package? If the issue was fixed in text-2.0.2, does that also fix the issue with conduit-extra?
This happens occasionally, but across architectures. Full build log: