Closed GoogleCodeExporter closed 8 years ago
I have tracked it down to
http://us3.php.net/manual/en/regexp.reference.subpatterns.php: "The maximum
number of captured substrings is 99, and the maximum number of all subpatterns,
both capturing and non-capturing, is 200." So the following string is the
smallest that will trigger a crash/timeout:
$test = str_repeat('0',202);
$sql = "'$test'";
And it's caused specifically by the tokenizer, as the following stand-alone
code demonstrates (it will hang if you run it):
$test = str_repeat('0',201);
$sql = "'$test'";
$sql = str_replace(array('\\\'','\\"',"\r\n","\n","()"),array("''",'""'," "," "," "), $sql);
$regex=<<<EOREGEX
/(`(?:[^`]|``)`|[@A-Za-z0-9_.`-]+(?:\(\s*\)){0,1})
|(\+|-|\*|\/|!=|>=|<=|<>|>|<|&&|\|\||=|\^)
|(\(.*?\)) # Match FUNCTION(...) OR BAREWORDS
|('(?:[^']|'')*'+)
|("(?:[^"]|"")*"+)
|([^ ,]+)
/ix
EOREGEX
;
$tokens = preg_split($regex, $sql,-1, PREG_SPLIT_NO_EMPTY |
PREG_SPLIT_DELIM_CAPTURE);
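If the process survives the call rather than crashing, PCRE's error code can be inspected instead of guessing. A minimal sketch: split_with_status is a hypothetical helper (not part of php-sql-parser), and whether a given PHP build reports an error here or dies outright is an assumption that depends on the build and on the pcre.* ini limits:

```php
<?php
// Hypothetical helper (not part of php-sql-parser): run preg_split and
// return PCRE's status alongside the tokens instead of assuming success.
function split_with_status(string $regex, string $sql): array
{
    $tokens = preg_split(
        $regex,
        $sql,
        -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
    );

    return array(
        'ok'     => $tokens !== false,  // preg_split returns false on failure
        'error'  => preg_last_error(),  // PREG_NO_ERROR, PREG_BACKTRACK_LIMIT_ERROR, ...
        'tokens' => $tokens === false ? array() : $tokens,
    );
}

// Shown with a deliberately tiny pattern; pass the tokenizer $regex and
// $sql from the example above to probe the reported input.
$result = split_with_status('/(,)/', 'a,b');
```

An 'error' value other than PREG_NO_ERROR would point at a PCRE limit being hit rather than at a bug elsewhere in the parser.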
Original comment by kbacht...@gmail.com
on 20 Oct 2011 at 3:35
Whoops, that should read:
$test = str_repeat('0',202);
In the second stand-alone example.
Original comment by kbacht...@gmail.com
on 20 Oct 2011 at 3:35
Here's a simple fix. In the $regex in the parser (~line 168) in split_sql,
change the ' and " parts to have a + after the character classes:
|('(?:[^']+|'')*'+)
|("(?:[^"]+|"")*"+)
(note the added '+' after the character class on both lines)
This makes each run of characters between the '' or "" delimiters match in a
single step, instead of matching every character separately and thereby
quickly reaching the 200 limit. Of course, this will still break if you have
more than 200 context switches between data, '', data, '', etc., but
hopefully that should happen very, very rarely :-)
Hope this helps!
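The change can be checked in isolation with a sketch like the following. It uses only the single-quote alternative of the tokenizer regex, not the whole pattern; first_literal is a hypothetical helper, and the inputs are illustrative:

```php
<?php
// Only the single-quoted-string alternative, taken out of the full regex.
$before = "/('(?:[^']|'')*'+)/";   // the group repeats once per character
$after  = "/('(?:[^']+|'')*'+)/";  // the group repeats once per run of characters

// Hypothetical helper: return the first quoted literal matched, or null
// if there is no match or PCRE reports an error.
function first_literal(string $regex, string $sql): ?string
{
    $matched = preg_match($regex, $sql, $m);
    return $matched === 1 ? $m[1] : null;
}

$long    = "'" . str_repeat('0', 202) . "'"; // the literal from the report
$escaped = "'it''s'";                        // doubled quote inside the literal

// The changed pattern should still return each literal whole.
$long_ok    = first_literal($after, $long) === $long;
$escaped_ok = first_literal($after, $escaped) === $escaped;

// On short input both patterns should agree, so the change does not
// alter what is tokenized, only how many repetitions it takes.
$same_on_short = first_literal($before, $escaped) === first_literal($after, $escaped);
```

Running first_literal($before, $long) on an affected build is how the original failure would show up; on other builds it may simply succeed.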
Original comment by kbacht...@gmail.com
on 20 Oct 2011 at 3:41
I can confirm that this issue exists and that comment #3
(http://code.google.com/p/php-sql-parser/issues/detail?id=11#c3) does indeed
fix it.
Original comment by ben.swin...@gmail.com
on 13 Jan 2012 at 2:28
Solution (comment #3) added to the current version at
http://www.phosco.info/php-sql-parser_current.zip
Original comment by pho...@gmx.de
on 2 Feb 2012 at 8:25
@pho...@gmx.de
Not added to the current version. The current version has
|('(?:[^']|'')*'+)
|("(?:[^"]|"a")*"+)
Whereas solution has
|('(?:[^']+|'')*'+)
|("(?:[^"]+|"")*"+)
Original comment by ben.swin...@gmail.com
on 5 Mar 2012 at 10:29
I have added a test with the code provided by jonny and it works. I have
changed the regular expression, so perhaps it works without your changes. Can
you provide test code that doesn't work?
Try the repository on https://www.phosco.info/publicsvn/php-sql-parser
I have no commit rights on the original codebase, so I have to provide the
changes on my own SVN.
Original comment by pho...@gmx.de
on 5 Mar 2012 at 12:08
Accepted fixed codebase.
Original comment by greenlion@gmail.com
on 12 Mar 2012 at 9:54
Original issue reported on code.google.com by
johnny.c...@gmail.com
on 20 May 2011 at 5:33