cesanta / v7

Embedded JavaScript engine for C/C++

RegExp parsing fails on character classes which contain a match for NUL character #560

Open Fordi opened 8 years ago

Fordi commented 8 years ago

Attached jslint.js (zipped): jslint.js.zip

Version: ea633dda0674c7a8cf2dfd02de9059746a67e396

Fordi commented 8 years ago

Tracked down the offending regexp:

var rx_unsafe = /[\u0000-\u001f\u007f-\u009f\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/;

The same error reproduces with ./v7 -e '/[\x00]/'

Updating ticket title to reflect real bug.

Fordi commented 8 years ago

I was able to locate the error message and work out that slre_compile is returning SLRE_MALFORMED_CHARSET, which suggests to me that the escapes are getting preprocessed somewhere. I don't know how to run a debugger on C code and I got no stack trace, so that's about as far as I got.

goniz commented 8 years ago

I encountered this today as well. This regexp seems to be present in pure-JS JSON parsers too, and they are broken by this issue.

Fordi commented 8 years ago

Seems to come down to the fact that "Rune" is a 16-bit unsigned integer and slre_env uses slre_env->curr_rune == 0 as an error flag, so a legitimately parsed NUL rune looks the same as a failure.
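
To illustrate the kind of collision described above, here is a minimal sketch in C. It is not v7's or SLRE's actual code: rune_t, parse_hex_sentinel, parse_hex_checked and rune_sentinel_self_check are hypothetical names made up for this example. It only assumes what the comment states, namely a 16-bit rune where the value 0 doubles as the error signal, and contrasts that with returning the status separately from the value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit rune type (not v7's definition). */
typedef uint16_t rune_t;

/* Returns the value of a hex digit, or -1 if c is not one. */
static int hex_digit(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Sentinel style: returns the rune for "\xHH", or 0 on malformed input.
 * A valid "\x00" therefore cannot be told apart from an error. */
rune_t parse_hex_sentinel(const char *s) {
  int hi, lo;
  if (s[0] != '\\' || s[1] != 'x') return 0;
  if ((hi = hex_digit(s[2])) < 0) return 0;
  if ((lo = hex_digit(s[3])) < 0) return 0;
  return (rune_t)(hi * 16 + lo);
}

/* Checked style: status is returned separately (1 = ok, 0 = malformed),
 * and the rune is written through *out only on success. */
int parse_hex_checked(const char *s, rune_t *out) {
  int hi, lo;
  if (s[0] != '\\' || s[1] != 'x') return 0;
  if ((hi = hex_digit(s[2])) < 0) return 0;
  if ((lo = hex_digit(s[3])) < 0) return 0;
  *out = (rune_t)(hi * 16 + lo);
  return 1;
}

/* Small self-check a reader can call from main(). */
void rune_sentinel_self_check(void) {
  rune_t r = 0xffff;
  /* Ordinary escapes work in both styles. */
  assert(parse_hex_sentinel("\\x41") == 0x41);
  /* The sentinel style gives the same answer for NUL and for garbage. */
  assert(parse_hex_sentinel("\\x00") == parse_hex_sentinel("\\xZZ"));
  /* The checked style distinguishes the two cases. */
  assert(parse_hex_checked("\\x00", &r) == 1 && r == 0);
  assert(parse_hex_checked("\\xZZ", &r) == 0);
}
```

With the sentinel style, a caller that treats 0 as "malformed" would reject a character class containing \x00 even though the escape itself is valid. Whether a fix along the lines of parse_hex_checked fits the real slre_env structure is an open question.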

Fordi commented 8 years ago

Another regexp that fails with SLRE_MALFORMED_CHARSET:

/[`\\]/