ruby / prism

Prism Ruby parser
MIT License

`Parser::Translator` is accepting certain regexp flags where `parser` would raise #2957

Closed: Earlopain closed this issue 1 month ago

Earlopain commented 1 month ago

With plain `parser`, the following raises an error: `'/あ/n', 3.3)`
# => 'String#encode': U+3042 from UTF-8 to ASCII-8BIT (Encoding::UndefinedConversionError)
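
For reference, the error above can be reproduced in plain Ruby without any parser involved. This is only a minimal sketch of the underlying `String#encode` behaviour, assuming the source string is UTF-8; it does not show how `parser` reaches that call.

```ruby
# A UTF-8 string holding the single character U+3042.
source = "あ"

# Transcoding to binary (ASCII-8BIT) has no mapping for U+3042,
# so String#encode raises instead of returning a string.
error =
  begin
    source.encode(Encoding::BINARY)
    nil
  rescue Encoding::UndefinedConversionError => e
    e
  end

puts error.class
puts error.message
```

An ASCII-only string would transcode without error, which is why the problem only shows up for patterns like `/あ/n`.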

Prism translation seems to ignore the `n` flag (but returns no AST): `'/あ/n', 3.3, parser_engine: :parser_prism)`
 @buffer=#<Parser::Source::Buffer (string)>,
    @location=#<Parser::Source::Range (string) 4...4>,
    @message="regexp encoding option 'n' differs from source encoding 'UTF-8'",
    @location=#<Parser::Source::Range (string) 4...4>,
    @message="/.../n has a non escaped non ASCII character in non ASCII-8BIT script: /あ/",
  [#<RuboCop::AST::Token:0x00007619a8ea8ea8 @pos=#<Parser::Source::Range (string) 0...1>, @text="/", @type=:tREGEXP_BEG>,
   #<RuboCop::AST::Token:0x00007619a8ea8e80 @pos=#<Parser::Source::Range (string) 1...2>, @text="あ", @type=:tSTRING_CONTENT>,
   #<RuboCop::AST::Token:0x00007619a8ea8e58 @pos=#<Parser::Source::Range (string) 2...3>, @text="/", @type=:tSTRING_END>,
   #<RuboCop::AST::Token:0x00007619a8ea8e30 @pos=#<Parser::Source::Range (string) 3...4>, @text="n", @type=:tREGEXP_OPT>]>

There's an open issue in rubocop-ast for this to not raise during parsing, but it is still a behaviour difference.

parser has the following code to construct a regexp. Maybe it just needs to be emulated?
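
The referenced snippet is not reproduced in this thread. As a rough sketch of what emulating it might involve, the code below assumes the regexp source is transcoded according to its encoding flag, which is what the `String#encode` error message suggests. `FLAG_ENCODINGS` and `encode_regexp_source` are hypothetical names for illustration, not part of `parser` or Prism.

```ruby
# Hypothetical mapping from regexp encoding flags to target encodings.
FLAG_ENCODINGS = {
  u: Encoding::UTF_8,
  e: Encoding::EUC_JP,
  s: Encoding::WINDOWS_31J,
  n: Encoding::BINARY
}.freeze

# Hypothetical helper: transcode the regexp source for the first
# encoding flag present, or return it untouched if there is none.
# Raises Encoding::UndefinedConversionError when a character cannot
# be represented in the target encoding (e.g. "あ" with the n flag).
def encode_regexp_source(source, flags)
  flag = flags.find { |f| FLAG_ENCODINGS.key?(f) }
  flag ? source.encode(FLAG_ENCODINGS.fetch(flag)) : source
end

puts encode_regexp_source("abc", [:n]).encoding
puts encode_regexp_source("あ", [:u]).encoding

begin
  encode_regexp_source("あ", [:n])
rescue Encoding::UndefinedConversionError => e
  puts "raised: #{e.class}"
end
```

Under these assumptions, the exception comes from the transcoding step itself rather than from any diagnostic the parser reports.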

kddnewton commented 1 month ago

It seems very odd that you would explicitly want an encoding error, as opposed to going through the normal diagnostics flow. @koic, is this the desired behavior here?

Earlopain commented 1 month ago

On second thought, you are right. I should have reported this to the parser gem instead; emulating this behaviour doesn't make much sense.