Closed KlausC closed 8 years ago
thanks. i'll look into this asap (i suspect i am treating unicode characters as single bytes or similar somewhere).
heh. it's actually the other way round (which explains how it's looking too far ahead). i thought the offset from the regexp match was in characters (it appears it's actually in bytes), and since the input is UTF-8 i was discarding more than i should have been. fixing now.
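For readers following along, the byte-versus-character distinction can be sketched in a few lines. This is only an illustration, not the code from the package: it assumes a current Julia 1.x (the thread concerns v0.4), and the string "€uro" is a made-up input. It shows that `m.offset` from `match` is a byte index into the UTF-8 string, so it cannot be used as a character count.

```julia
# Illustration only: assumes Julia 1.x, not the v0.4 code discussed in this thread.
s = "€uro"                  # '€' occupies 3 bytes in UTF-8, the other letters 1 byte each

m = match(r"u", s)          # 'u' is the 2nd character of s

n_chars = length(s)         # number of characters
n_bytes = ncodeunits(s)     # number of bytes (code units)

# The match offset is a byte index: 'u' starts at byte 4, not at "position 2".
byte_offset = m.offset

# Byte index 2 falls inside the encoding of '€', so it is not a valid string index.
inside_euro = !isvalid(s, 2)

# Slicing with the byte offset is safe and keeps exactly the text from the match on.
rest = s[byte_offset:end]

# If a character position is needed, count the characters up to the byte offset.
char_pos = length(s[1:byte_offset])

println((n_chars, n_bytes, byte_offset, char_pos, rest))
```

Mixing the two units, for example discarding `byte_offset` characters instead of `byte_offset` bytes, skips too far on any non-ASCII input, which is consistent with the behaviour described above.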
sorry there was no test for this btw.
oops, closed early. this is now in git, but i'm having to publish manually, so an actual release will take some time....
Thanks for the quick response. It works fine now, after I pulled the master branch.
No problem - thanks for the simple report with an obvious thing to test! It's now tagged in v1.7.7.
On Julia v0.4.3 I installed the package with Pkg.add("ParserCombinator"); using ParserCombinator. Pkg.installed("ParserCombinator") reports v"1.7.4".

Then the following test fails:

    parse_one("€", p".") # any non-ASCII character
    ERROR: BoundsError: attempt to access 3-element Array{UInt8,1}:
     0xe2
     0x82
     0xac
      at index [4]
     in schedule_and_wait at task.jl:343
     in consume at task.jl:259
     in once at ~/.julia/v0.4/ParserCombinator/src/core/parsers.jl:182
     in single_result at ~/.julia/v0.4/ParserCombinator/src/core/parsers.jl:193