---Octree-0.5.cabal:39:3: Non breaking spaces at 39:3, 41:3, 43:3
+++Octree-0.5.cabal:39:1: Non breaking spaces at 39:5, 41:5, 43:5
Before: the lexer counted the 2 non-breaking spaces as 2 Unicode characters and reported the next position as 3.
After: the lexer counts the 2 non-breaking spaces as 4 bytes and reports the next position as 5.
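The column arithmetic can be reproduced with a small sketch. This is not Alex's or Cabal's code: the helper names are hypothetical, and it assumes 1-based columns and an indent made of two U+00A0 characters encoded as UTF-8.

```python
# Illustrative only: this is not Alex's or Cabal's code, and the helper
# names below are hypothetical.
NBSP = "\u00a0"  # NO-BREAK SPACE, encoded in UTF-8 as the two bytes 0xC2 0xA0


def next_column_by_chars(prefix: str, start: int = 1) -> int:
    """Column after `prefix`, advancing one column per Unicode code point."""
    return start + len(prefix)


def next_column_by_bytes(prefix: str, start: int = 1) -> int:
    """Column after `prefix`, advancing one column per UTF-8 encoded byte."""
    return start + len(prefix.encode("utf-8"))


indent = NBSP * 2  # two non-breaking spaces at the start of a line

print(next_column_by_chars(indent))  # 3, matching the old "39:3" position
print(next_column_by_bytes(indent))  # 5, matching the new "39:5" position
```

Counting per code point gives the old position and counting per byte gives the new one, which matches the difference between the two reported columns.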
Cabal shouldn't have relied on this behavior to begin with, but it took us some time to realize what the issue was. We were also asked to report the issue here, as the breaking change may violate the PVP.
Upgrading Cabal from using Alex 3.2.6 to using Alex 3.2.7 broke some of its tests.
Upstream ticket: https://github.com/haskell/cabal/pull/8896
My understanding is that Cabal uses a latin1 lexer but uses it to parse some Unicode tokens (the BOM and the 2-byte non-breaking space code point). It relied on an Alex bug that counted characters taking UTF-8 characters into account, even in latin1 lexers. This bug has been fixed in https://github.com/haskell/alex/commit/ae525e34edf017544e8ef4457d7e57cf2081dcf9#diff-007f894e1221eb8cafde8fdf0ee317bd32859704c27652c29cbb5417a9d5c37dR179-R185, leading to changes in Cabal test outputs of the kind shown in the diff above.