runpaint / read-ruby

Free ebook about Ruby 1.9
http://ruby.runpaint.org/

Regexp: Unicode property name casing for XDigit and ASCII #68

Closed ammar closed 14 years ago

ammar commented 14 years ago

The following Unicode property names have incorrect casing:

Xdigit should be XDigit. Ascii should be ASCII. Newline should be NEWLINE.

According to Oniguruma source: onig-5.9.2/enc/unicode.c:10490

Thanks, Ammar

runpaint commented 14 years ago

I'll change them if you want, but property names have been case-insensitive for over a year (http://github.com/ruby/ruby/commit/ee4b59a4191ecabc1a9d396e234f20be5e5e9f8c).

ammar commented 14 years ago

I had to double-check the error I was seeing. It turns out I was confused by differing behavior in a script (where \p{Ascii} fails and \p{ASCII} works).

In irb any casing works. Both are the same revision (ruby 1.9.2p0 (2010-08-18 revision 29036)), so it must be my environment somehow.

The docs are clear and correct. Closing this one too.

Thanks again. Great work!

runpaint commented 14 years ago

It sounds as if the Regexp in your script isn't being interpreted as UTF-8. Either use the u option (http://goo.gl/xKuG) or a "magic comment" (http://goo.gl/FEJs). I'll open a new ticket to remind me to clarify this in the text.
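
For reference, a minimal sketch of both fixes used together, assuming the script is saved as UTF-8. The helper names (ascii_variants, xdigit_variants, all_match?) are made up for illustration and are not from the book or from Ruby itself. The magic comment sets the source encoding for the whole file, and the u option forces each literal to UTF-8, so either one alone should be enough:

```ruby
# encoding: utf-8
# The magic comment above must be the first line of the script
# (or the second, directly after a shebang line).

# Hypothetical helper: the same property written with three casings.
# The trailing u option forces each literal to be compiled as UTF-8.
def ascii_variants
  [/\p{ASCII}/u, /\p{Ascii}/u, /\p{ascii}/u]
end

# Hypothetical helper: likewise for the hex digit property.
def xdigit_variants
  [/\p{XDigit}/u, /\p{Xdigit}/u, /\p{xdigit}/u]
end

# Hypothetical helper: true when every pattern matches the text.
def all_match?(patterns, text)
  patterns.all? { |re| !(re =~ text).nil? }
end

puts ascii_variants.map { |re| re.encoding }.uniq.inspect
puts all_match?(ascii_variants, "a")
puts all_match?(xdigit_variants, "f")
```

If a script still rejects one casing, printing the encoding of the Regexp (as the first puts does) is a quick way to check whether it was compiled as UTF-8.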