PiRSquared17 / re2

Automatically exported from code.google.com/p/re2
BSD 3-Clause "New" or "Revised" License

\w isn't unicode aware #107

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
What is the expected output? What do you see instead?

According to the docs, \w matches only ASCII word characters. In Perl (5.10 at
least, on CentOS 6.4), running
perl -C -e 'use utf8; print "yes" if "ó" =~ /\w/'
prints yes. It would be very helpful for \w to be Unicode-aware, which would
also allow \b to be Unicode-aware.
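
For reference, a minimal sketch of the behavior and a possible workaround. It uses Go's regexp package, which accepts RE2 syntax, rather than the C++ RE2 library itself, so treat it as an illustration only. The names asciiWord and unicodeWord are made up for this example, and the class [\p{L}\p{N}_] is only a rough approximation of Perl's Unicode \w, not an exact equivalent.

```go
package main

import (
	"fmt"
	"regexp"
)

// asciiWord uses \w, which in RE2 syntax is the ASCII class [0-9A-Za-z_].
var asciiWord = regexp.MustCompile(`\w`)

// unicodeWord spells out an approximate Unicode-aware word class:
// any Unicode letter or number, plus the underscore.
// (Assumption: this is close enough to Perl's \w for the case reported here.)
var unicodeWord = regexp.MustCompile(`[\p{L}\p{N}_]`)

func main() {
	s := "\u00f3" // "ó" as a single precomposed code point

	fmt.Println(asciiWord.MatchString(s))   // false
	fmt.Println(unicodeWord.MatchString(s)) // true
}
```

Writing the class out by hand covers \w, but it does not change \b, which is still defined in terms of ASCII word characters.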

What version of the product are you using? On what operating system?

CentOS 6.4

Please provide any additional information below.
NOTE: If you have a suggested patch, please see
http://code.google.com/p/re2/wiki/Contribute
for information about sending it in for review.  Thanks.

Original issue reported on code.google.com by qui...@gmail.com on 27 Feb 2014 at 10:20

GoogleCodeExporter commented 9 years ago
RE2 has moved to GitHub. I have not moved the issues over. If this issue is 
still important to you, please file a new one at 
https://github.com/google/re2/issues. Thank you.

Original comment by rsc@golang.org on 11 Dec 2014 at 4:45