chrsmithdemos / codesearch

Automatically exported from code.google.com/p/codesearch
BSD 3-Clause "New" or "Revised" License

csearch sometimes fails on character classes containing only the same letter in upper and lowercase #8

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
laptop$ cat userids.txt 
dgryski
laptop$ cindex .
2012/01/24 14:48:22 index /[XXXXXX]/
2012/01/24 14:48:22 flush index
2012/01/24 14:48:22 merge 0 files + mem
2012/01/24 14:48:22 8 data bytes, 237 index bytes
2012/01/24 14:48:22 done
laptop$ csearch '[g]r'
/[XXXXX]/userids.txt:dgryski
laptop$ csearch '[Hg]r'
/[XXXXX]/userids.txt:dgryski
laptop$ csearch '[Gg]r'
laptop$ 
laptop$ csearch 'g[Rr]'
/[XXXXX]/userids.txt:dgryski
laptop$ csearch '[Dd]g'
laptop$ csearch '[ZDd]g'
/[XXXXX]/userids.txt:dgryski
laptop$

What is the expected output? What do you see instead?
I expect 'dgryski' to be printed for every pattern above, but for some regexes 
(for example '[Gg]r' and '[Dd]g') csearch finds no lines.
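
As a cross-check of the expected behavior, here is a minimal sketch that runs the same six patterns through Go's standard regexp package. It assumes csearch is meant to agree with that package on these patterns. The names `patterns` and `unmatched` are made up for this illustration and are not part of codesearch.

```go
package main

import "regexp"

// patterns are the six regular expressions tried in the report above.
var patterns = []string{"[g]r", "[Hg]r", "[Gg]r", "g[Rr]", "[Dd]g", "[ZDd]g"}

// unmatched returns the patterns that fail to match line when evaluated
// with Go's standard regexp package (not with csearch itself).
func unmatched(line string) []string {
	var missing []string
	for _, p := range patterns {
		if !regexp.MustCompile(p).MatchString(line) {
			missing = append(missing, p)
		}
	}
	return missing
}
```

Calling `unmatched("dgryski")` should return an empty slice, since each pattern matches a substring of that line.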

What version of the product are you using? On what operating system?
070ef10ab799 tip.  Darwin 10.8.0

Please provide any additional information below.

Original issue reported on code.google.com by dgryski on 24 Jan 2012 at 1:59

GoogleCodeExporter commented 9 years ago
Actually, it looks like it fails when the two-letter character class is the 
first item in the regex. With a literal in front of the class, the match is found:

laptop$ csearch 'd[gG]r'
/[XXXXX]/userids.txt:dgryski

Original comment by dgryski on 24 Jan 2012 at 2:09

GoogleCodeExporter commented 9 years ago
I did a bit of investigation last night and it looks like the problem is in 
match.go:stepByte().  If we need to fold for a particular instruction, we 
uppercase the character 'c', but it doesn't get reset back to the lowercase 
version when we move on to processing the next state. Later states therefore 
see the modified uppercase 'c' instead of the original lowercase character 
that's actually in the string.

I've attached a patch to match.go to fix this, and two test cases to 
regexp_test.go.
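
To illustrate the folding problem described above, here is a much simplified sketch. It is not the actual match.go code or the attached patch: the `inst` type and the `toUpper`, `stepBuggy`, and `stepFixed` functions are hypothetical stand-ins that only model "fold the byte for one instruction, then reuse the same variable for the next one".

```go
package main

// inst is a hypothetical, simplified matcher instruction: it accepts a byte
// in [lo, hi], optionally after folding lowercase ASCII to uppercase.
type inst struct {
	lo, hi byte
	fold   bool
}

// toUpper folds a lowercase ASCII letter to uppercase and leaves other bytes alone.
func toUpper(c byte) byte {
	if 'a' <= c && c <= 'z' {
		return c - ('a' - 'A')
	}
	return c
}

// stepBuggy overwrites c when an instruction folds, so every later
// instruction in the same step sees the uppercased byte.
func stepBuggy(insts []inst, c byte) bool {
	for _, in := range insts {
		if in.fold {
			c = toUpper(c) // bug: the change leaks into later iterations
		}
		if in.lo <= c && c <= in.hi {
			return true
		}
	}
	return false
}

// stepFixed folds into a per-instruction copy, leaving c untouched.
func stepFixed(insts []inst, c byte) bool {
	for _, in := range insts {
		x := c
		if in.fold {
			x = toUpper(x)
		}
		if in.lo <= x && x <= in.hi {
			return true
		}
	}
	return false
}
```

With the instructions `{'Z', 'Z', true}` followed by `{'g', 'g', false}` and the input byte 'g', `stepBuggy` reports no match because the second instruction sees 'G', while `stepFixed` matches.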

Original comment by dgryski on 27 Jan 2012 at 8:51

GoogleCodeExporter commented 9 years ago
This issue was closed by revision 5cd8d184e954.

Original comment by dgryski on 2 May 2012 at 8:05

GoogleCodeExporter commented 9 years ago
Issue 19 has been merged into this issue.

Original comment by dgryski on 2 May 2012 at 8:54