ammar / regexp_parser

A regular expression parser library for Ruby
MIT License

incompatible character encodings when calling `.to_s` on tree parsed from single regex #74

Closed: michaelglass closed this issue 4 years ago

michaelglass commented 4 years ago

related: https://github.com/rubocop-hq/rubocop/issues/9056

Regexp::Parser.parse((/¡#≥
/x).to_s).to_s

expected: all nodes of tree are UTF-8 (they all come from a UTF-8 source string)

actual: raises Encoding::CompatibilityError: incompatible character encodings: UTF-8 and ASCII-8BIT
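For readers unfamiliar with this error class, here is a minimal plain-Ruby sketch of how it arises in general. It does not use regexp_parser and makes no claim about the library's internals. It only assumes, based on the message above, that at least one piece of text in the tree was tagged ASCII-8BIT while containing non-ASCII bytes. `join_error` is a hypothetical helper written for this illustration.

```ruby
# Hypothetical helper (not part of any library): returns the class of the
# encoding error raised when concatenating two strings, or nil if none.
def join_error(a, b)
  a + b
  nil
rescue EncodingError => e
  e.class
end

utf8   = "\u00A1"          # "¡", tagged UTF-8
binary = "\xE2\x89\xA5".b  # the bytes of "≥", tagged ASCII-8BIT

# Non-ASCII UTF-8 text plus non-ASCII binary bytes cannot be combined.
join_error(utf8, binary)                                      # => Encoding::CompatibilityError

# Re-tagging the same bytes as UTF-8 makes the two strings compatible.
join_error(utf8, binary.dup.force_encoding(Encoding::UTF_8))  # => nil

# ASCII-only binary strings are compatible with UTF-8, so the problem
# only shows up once non-ASCII characters are involved.
join_error(utf8, "abc".b)                                     # => nil
```

This would be consistent with the failure only showing up for a pattern containing characters such as `¡` and `≥`, while ASCII-only patterns are unaffected.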

thank you for your great lib!

jaynetics commented 4 years ago

hi @michaelglass and thank you for the report!

rubocop is really turning out to be a great detector of wrinkles that still need to be ironed out.

i've just released v2.0.0 where this is fixed.

heads up: this version also includes a change to address https://github.com/ammar/regexp_parser/issues/72, which means that any code in rubocop that calculates indices of regexp parts will need to be updated when moving to 2.0.0.