whitequark / parser

A Ruby parser.

- lexer.rl: fix incompatible delimiters on percent literal #808

Closed pocke closed 3 years ago

pocke commented 3 years ago

CRuby accepts only ASCII characters other than [A-Za-z0-9] as the delimiter of a percent literal, but the lexer accepts other characters as well. For example:

This patch fixes the problems.
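The delimiter rule described above can be sketched as a small Ruby predicate. This is only an illustration of the rule as stated in this description, not the code in lexer.rl or parse.y; the method name `valid_percent_delimiter?` is hypothetical.

```ruby
# Hypothetical helper illustrating the rule from this description:
# a percent-literal delimiter must be a single ASCII character
# that is not alphanumeric. Not part of the parser gem or CRuby.
def valid_percent_delimiter?(char)
  char.length == 1 &&            # exactly one character
    char.ascii_only? &&          # rejects multibyte characters such as "★"
    !char.match?(/[A-Za-z0-9]/)  # rejects letters and digits
end

valid_percent_delimiter?("!")  # => true
valid_percent_delimiter?("1")  # => false
valid_percent_delimiter?("★")  # => false
```

The sketch treats opening brackets such as `(` as valid, since they are ASCII and non-alphanumeric; matching the closing bracket is a separate concern that it does not model.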

Investigation of CRuby

CRuby parses percent literals here: https://github.com/ruby/ruby/blob/6072239121360293dbd2ed607f16b6a11668999a/parse.y#L8718-L8815

ASCII delimiters

I confirmed that Ruby 1.8 and later accept every ASCII character except alphanumerics as a delimiter, using the following code.

test.rb

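(0..127).each do |n|
  # Skip alphanumerics (expected to be rejected) and brackets (they need a matching closer).
  next if /[a-zA-Z0-9()<>{}\[\]]/ =~ n.chr
  # Raises SyntaxError if this Ruby rejects the character as a delimiter.
  eval "%q#{n.chr}foo#{n.chr}"
end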
$ docker run -it --rm -v $(pwd)/test.rb:/tmp/test.rb rubylang/all-ruby env ALL_RUBY_SINCE=1.8 ./all-ruby /tmp/test.rb
ruby-1.8.0
...
ruby-2.0.0-p648
ruby-2.1.0-preview1   (eval):1: warning: encountered \r in middle of line, treated as a mere space
...
ruby-2.7.4            (eval):1: warning: encountered \r in middle of line, treated as a mere space
ruby-3.0.0-preview1
...
ruby-3.0.2
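The script above stops at the first character that fails. A variant that collects every rejected character instead might look like the sketch below. It assumes CRuby, because it uses `RubyVM::InstructionSequence.compile` to parse without executing (similar in spirit to `ruby -c`); the helper names `compiles?` and `rejected_percent_delimiters` are hypothetical.

```ruby
# Returns true when the running CRuby can compile +source+,
# false when compilation raises a SyntaxError.
# RubyVM::InstructionSequence is CRuby-specific.
def compiles?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

# Hypothetical helper: lists the ASCII characters that the running Ruby
# rejects as a %q delimiter. Brackets are skipped because they need a
# matching closer, as in the original test script.
def rejected_percent_delimiters
  paired = "()<>{}[]".chars
  (0..127).map(&:chr).reject do |c|
    paired.include?(c) || compiles?("%q#{c}foo#{c}")
  end
end

# inspect keeps control characters readable in the output.
puts rejected_percent_delimiters.map(&:inspect).join(" ")
```

If the behaviour matches the investigation above, the list should contain only alphanumeric characters, but the exact output depends on the Ruby version that runs it.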

Number and multibyte delimiter

I also confirmed that Ruby 1.8 and later reject 1 and ★ as delimiters, using the following commands.

$ docker run -it --rm rubylang/all-ruby env ALL_RUBY_SINCE=1.8 ./all-ruby -ce '%q1foo1'
ruby-1.8.0            -e:1: unknown type of %string
                      %q1foo1
                         ^
                  exit 1
...
ruby-2.4.10           -e:1: unknown type of %string
                      %q1foo1
                         ^
                  exit 1
ruby-2.5.0-preview1   -e:1: unknown type of %string
                      %q1foo1
                      ^~~
                  exit 1
...
ruby-2.6.8            -e:1: unknown type of %string
                      %q1foo1
                      ^~~
                  exit 1
ruby-2.7.0-preview1   -e:1: unknown type of %string
                      %q1foo1
                      ^~~
                  exit 1
ruby-2.7.0-preview2   -e:1: unknown type of %string
                      %q1foo1
                      ^~~
                  exit 1
...
ruby-3.0.2            -e:1: unknown type of %string
                      %q1foo1
                      ^~~
                  exit 1
$ docker run -it --rm rubylang/all-ruby env ALL_RUBY_SINCE=1.8 ./all-ruby -ce '%q★foo★'
ruby-1.8.0            -e:1: Invalid char `\230' in expression
                      -e:1: Invalid char `\205' in expression
                  exit 2
ruby-1.8.1            -e:1: Invalid char `\230' in expression
                      -e:1: Invalid char `\205' in expression
                  exit 1
...
ruby-1.8.7-p374       -e:1: Invalid char `\230' in expression
                      -e:1: Invalid char `\205' in expression
                  exit 1
ruby-1.9.0-0          -e:1: unknown type of %string
                      %q★foo★
                         ^
                  exit 1
...
ruby-2.4.10           -e:1: unknown type of %string
                      %q★foo★
                         ^
                  exit 1
ruby-2.5.0-preview1   -e:1: unknown type of %string
                      %q★foo★
                      ^~~
                  exit 1
...
ruby-2.6.8            -e:1: unknown type of %string
                      %q★foo★
                      ^~~
                  exit 1
ruby-2.7.0-preview1   -e:1: unknown type of %string
                      %q★foo★
                      ^~~
                  exit 1
ruby-2.7.0-preview2   -e:1: unknown type of %string
                      %q★foo★
                      ^~~
                  exit 1
...
ruby-2.7.4            -e:1: unknown type of %string
                      %q★foo★
                      ^~~
                  exit 1
ruby-3.0.0-preview1   -e:1: invalid multibyte char (US-ASCII)
                  exit 1
...
ruby-3.0.2            -e:1: invalid multibyte char (US-ASCII)
                  exit 1