Closed Earlopain closed 1 month ago
Parser::CurrentRuby.parse("/あ/n") # => .../gems/parser-3.3.3.0/lib/parser/builders/default.rb:2261:in 'String#encode': U+3042 from UTF-8 to ASCII-8BIT (Encoding::UndefinedConversionError)
Offending source: https://github.com/whitequark/parser/blob/570e06520b81a107948d10fadaea89bd612b9a8d/lib/parser/builders/default.rb#L2249-L2268
Here's what prism has to say about this:
Prism.parse("/あ/n") => #<Prism::ParseResult:0x00007fb448c09438 @comments=[], @data_loc=nil, @errors= [#<Prism::ParseError @type=:regexp_encoding_option_mismatch @message="regexp encoding option 'n' differs from source encoding 'UTF-8'" @location=#<Prism::Location @start_offset=6 @length=0 start_line=1> @level=:syntax>, #<Prism::ParseError @type=:regexp_non_escaped_mbc @message="/.../n has a non escaped non ASCII character in non ASCII-8BIT script: /あ/" @location=#<Prism::Location @start_offset=6 @length=0 start_line=1> @level=:syntax>], @magic_comments=[], @source=#<Prism::Source:0x00007fb448b865d8 @offsets=[0], @source="/あ/n", @start_line=1>, @value= @ ProgramNode (location: (1,0)-(1,6)) ├── locals: [] └── statements: @ StatementsNode (location: (1,0)-(1,6)) └── body: (length: 1) └── @ RegularExpressionNode (location: (1,0)-(1,6)) ├── flags: ascii_8bit ├── opening_loc: (1,0)-(1,1) = "/" ├── content_loc: (1,1)-(1,4) = "あ" ├── closing_loc: (1,4)-(1,6) = "/n" └── unescaped: "あ", @warnings= [#<Prism::ParseWarning @type=:void_statement @message="possibly useless use of a literal in void context" @location=#<Prism::Location @start_offset=0 @length=6 start_line=1> @level=:verbose>]>
(Originally raised at https://github.com/ruby/prism/issues/2957 but I think it makes more sense here)
I think, a pretty simple solution would be to simply catch the encoding error and wrap it into a diagnostic. WDYT?
a pretty simple solution would be to simply catch the encoding error and wrap it into a diagnostic
Yes, it makes sense. Would you like to send a PR? If not I'll do it myself in a few days.
Offending source: https://github.com/whitequark/parser/blob/570e06520b81a107948d10fadaea89bd612b9a8d/lib/parser/builders/default.rb#L2249-L2268
Here's what prism has to say about this:
(Originally raised at https://github.com/ruby/prism/issues/2957 but I think it makes more sense here)
I think, a pretty simple solution would be to simply catch the encoding error and wrap it into a diagnostic. WDYT?