ammar / regexp_parser

A regular expression parser library for Ruby
MIT License

Possible feature incompatibility with Ruby versions. #84

Closed: mbj closed 2 years ago

mbj commented 2 years ago

During my recent work on regexp mutations in mutant, I found cases where regexp_parser indicates that a given Ruby version supports a specific Unicode property, but Ruby itself apparently does not. It could very well be that I am using the regexp_parser API incorrectly.

# report.rb
require 'regexp_parser'

syntax = ::Regexp::Syntax.version_class("ruby/#{RUBY_VERSION}")

puts "RUBY_VERSION:            #{RUBY_VERSION}"
puts "Regexp::Parser::VERSION: #{::Regexp::Parser::VERSION}"
# `syntax` is itself a class, so this line prints "Class".
puts "Syntax:                  #{syntax.class}"

puts "Does not recognize while indicated by regexp_parser:"

syntax
  .features
  .fetch(:property, []).each do |property|
    property_specifier = "\\p{#{property}}"

    begin
      /#{property_specifier}/
    rescue RegexpError
      puts property_specifier
    end
  end

I've tested with the latest release of each non-EOL Ruby, which gives me this output:

RUBY_VERSION:            2.7.6
Regexp::Parser::VERSION: 2.3.0
Syntax:                  Class
Does not recognize while indicated by regexp_parser:
\p{egyptian_hieroglyph_format_controls}
\p{ottoman_siyaq_numbers}
\p{small_kana_extension}
\p{symbols_and_pictographs_extended_a}
\p{tamil_supplement}
RUBY_VERSION:            3.0.4
Regexp::Parser::VERSION: 2.3.0
Syntax:                  Class
Does not recognize while indicated by regexp_parser:
\p{egyptian_hieroglyph_format_controls}
\p{ottoman_siyaq_numbers}
\p{small_kana_extension}
\p{symbols_and_pictographs_extended_a}
\p{tamil_supplement}
RUBY_VERSION:            3.1.2
Regexp::Parser::VERSION: 2.3.0
Syntax:                  Class
Does not recognize while indicated by regexp_parser:
\p{egyptian_hieroglyph_format_controls}
\p{ottoman_siyaq_numbers}
\p{small_kana_extension}
\p{symbols_and_pictographs_extended_a}
\p{tamil_supplement}

The fix may be as easy as removing the indicated support, or telling me where I'm using the API incorrectly.

jaynetics commented 2 years ago

hey,

you're doing it right!

these properties should not be listed as supported.

my guess is that i've either added them to regexp_parser in error, or they were removed in a later ruby version (2.7.x).

i'll have to investigate a bit and add a spec to detect such "excessive" properties.

thanks for the pointer!
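
A minimal sketch of what a check for such "excessive" properties could look like. This is not the spec that ended up in regexp_parser; `unsupported_properties` is a hypothetical helper name, and the sample input is made up for illustration. It only assumes that the running Ruby raises `RegexpError` for a `\p{...}` name it does not know, as the report script above relies on.

```ruby
# Hypothetical helper (not part of regexp_parser): returns the names from
# `candidates` that the running Ruby rejects as \p{...} properties.
def unsupported_properties(candidates)
  candidates.reject do |name|
    begin
      # Regexp.new raises RegexpError for an unknown property name.
      Regexp.new("\\p{#{name}}")
      true
    rescue RegexpError
      false
    end
  end
end

# Illustrative input: one common property and one made-up name.
p unsupported_properties(%w[alpha definitely_not_a_property])
```

In a spec, the candidates would presumably be the list from `syntax.features.fetch(:property, [])` for the running Ruby version, with the expectation that the helper returns an empty array.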

jaynetics commented 2 years ago

@mbj these five properties were never supported. i've just released regexp_parser v2.3.1 with a fix.