rubocop / rubocop-ast

RuboCop's AST extensions and NodePattern functionality
https://docs.rubocop.org/rubocop-ast
MIT License
104 stars 52 forks source link

Return parser error on parsing invalid encoding regexp #305

Closed r7kamura closed 2 months ago

r7kamura commented 2 months ago

I noticed that if I give ProcessedSource code that contains an invalid encoding regular expression (e.g. /あ/n), it does not catch this encoding error.

$ ruby -r rubocop-ast -e "RuboCop::AST::ProcessedSource.new('/あ/n', 3.3)"
/home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/parser-3.3.4.0/lib/parser/builders/default.rb:2261:in `encode': U+3042 from UTF-8 to ASCII-8BIT (Encoding::UndefinedConversionError)
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/parser-3.3.4.0/lib/parser/builders/default.rb:2261:in `static_regexp'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/parser-3.3.4.0/lib/parser/builders/default.rb:428:in `regexp_compose'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/parser-3.3.4.0/lib/parser/ruby33.rb:11614:in `_reduce_572'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/racc-1.8.0/lib/racc/parser.rb:263:in `_racc_do_parse_c'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/racc-1.8.0/lib/racc/parser.rb:263:in `do_parse'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/parser-3.3.4.0/lib/parser/base.rb:190:in `parse'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/parser-3.3.4.0/lib/parser/base.rb:238:in `tokenize'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/rubocop-ast-1.31.3/lib/rubocop/ast/processed_source.rb:225:in `tokenize'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/rubocop-ast-1.31.3/lib/rubocop/ast/processed_source.rb:220:in `parse'
        from /home/r7kamura/.rbenv/versions/3.3.4/lib/ruby/gems/3.3.0/gems/rubocop-ast-1.31.3/lib/rubocop/ast/processed_source.rb:48:in `initialize'
        from -e:1:in `new'
        from -e:1:in `<main>'

This exception is raised from #tokenize, so why not catch the exception raised here as well?

Earlopain commented 2 months ago

Openend an issue for the prism difference https://github.com/ruby/prism/issues/2957

Earlopain commented 2 months ago

Fixed in parser: https://github.com/whitequark/parser/pull/1033

It may still make sense to do this but would hide issues with parser which I think would be bad.

marcandre commented 2 months ago

Thanks @r7kamura and @Earlopain

So now that this is fixed in parser, we should be good without rescuing exceptions here, right? I agree we should avoid doing it if we can.

r7kamura commented 2 months ago

OK, I have confirmed on my end that the latest parser has successfully solved the problem. I'm going to close this issue now that rubocop-ast doesn't have to deal with it.

Thanks a lot!