ruby-i18n / i18n

Internationalization (i18n) library for Ruby
MIT License

Regex part deux - INTERPOLATION_SYNTAX #669

Closed kbrock closed 1 year ago

kbrock commented 1 year ago

Thanks for the great gem.

I was curious what I could do with INTERPOLATION_SYNTAX.

This has 3 commits.

I ran tests with ruby 2.6.9 and 3.0.6

Not sure when the syntax change for the substring was introduced str[1..]. RuboCop suggested I change my str[1..-1] to that. It also said the backslash in [^\}] was unnecessary.
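
To illustrate the two RuboCop suggestions mentioned above, here is a minimal sketch. The variable names and sample strings are made up for illustration; only str[1..], str[1..-1] and the [^\}] character class come from the description.

```ruby
# Endless range (the newer syntax) versus the older explicit end index.
str = "%{name}"
with_endless  = str[1..]    # drops the first character
with_explicit = str[1..-1]  # same result, also valid on older Rubies

# Inside a character class, "}" does not need a backslash.
escaped   = /[^\}]+/
unescaped = /[^}]+/
sample = "name}rest"
escaped_match   = sample[escaped]    # longest leading run without "}"
unescaped_match = sample[unescaped]  # identical behaviour
```

Both slices return the same string, and both character classes match the same text, so the change is purely cosmetic.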

Let me know if you would like to keep INTERPOLATION_SYNTAX and I can throw away the second commit. Or if you like it, I can squash the two. Something was nice about the multiple capture groups in the regular expression, but I didn't feel the complexity (from pcre's perspective) bought too much. But since this is your project, it is your call.

Also in reference to #667

As I started to run numbers, I'm feeling less and less like this is a DoS. So maybe I'm not the right person to state an opinion on whether these changes are necessary.

From the commit messages


/(%)?(%\{([^\}]+)\})/ =~ ('%{{'*9999)+'}'

/(%)?(%\{([^\}]+)\})/ ==> 199,984 steps
/(%%?)\{([^\}]+)\}/   ==> 129,989 steps

/(%%?\{[^\}]+\})/     ==>  99,992 steps

But the two-group version still hasn't reached the TOKENIZER performance, so the second commit went with the TOKENIZER one:

/(%%?\{[^\}]+\})/     ==>  99,992 steps
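
For anyone who wants to reproduce the comparison locally, the following is a rough sketch. It measures wall-clock time with Ruby's Benchmark module, which is only a proxy for the step counts quoted above (it cannot reproduce those numbers), and the labels and the iteration count of 100 are my own assumptions.

```ruby
require "benchmark"

# Pathological input from the commit message: many "%{{" runs, one closing "}".
input = ("%{{" * 9999) + "}"

# The three patterns compared above (labels are illustrative only).
patterns = {
  "original"   => /(%)?(%\{([^\}]+)\})/,
  "two groups" => /(%%?)\{([^\}]+)\}/,
  "one group"  => /(%%?\{[^\}]+\})/
}

# Run each pattern repeatedly and record where it first matches.
results = patterns.map do |label, regex|
  position = nil
  seconds = Benchmark.realtime { 100.times { position = (regex =~ input) } }
  puts format("%-12s matched at %p in %.4fs", label, position, seconds)
  [label, position]
end.to_h
```

All three patterns should report the same match position, since they differ only in how the match is split into capture groups; timings will vary by machine and Ruby version.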

kbrock commented 1 year ago

Come to think of it, we may be able to skip this regular expression entirely or, if we keep it, drop the capture group altogether. But I feel that would be way overkill, especially since we are already in linear time.
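
As a rough idea of what dropping the capture group could look like, here is a minimal sketch built on String#gsub with a block. The method name interpolate_sketch and its behaviour for unknown keys are hypothetical and are not the i18n implementation; the pattern is the last regex above with the outer parentheses removed.

```ruby
# Hypothetical helper, NOT the i18n API: substitutes %{key} from a hash
# and unescapes %%{key} to a literal %{key}.
def interpolate_sketch(string, values)
  token_pattern = /%%?\{[^}]+\}/ # no capture group needed with gsub + block

  string.gsub(token_pattern) do |token|
    if token.start_with?("%%")
      token[1..]                   # "%%{name}" is an escape: emit "%{name}"
    else
      key = token[2..-2].to_sym    # strip the leading "%{" and trailing "}"
      values.fetch(key) { token }  # assumption: leave unknown keys untouched
    end
  end
end

greeting = interpolate_sketch("Hi %{name}, write %%{name} literally", { name: "Ada" })
puts greeting
```

Because gsub yields the whole match to the block, the pattern needs no groups at all; whether that is worth the churn is the open question in the comment above.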

radar commented 1 year ago

Not sure when the syntax change for the substring was introduced str[1..].

Ruby 2.6.

This is currently the earliest version of Ruby that i18n supports, so I think it is safe.

radar commented 1 year ago

I like it! Simpler regular expressions will always get my vote.