yaoweibin / ngx_http_substitutions_filter_module

a filter module which can do both regular expression and fixed string substitutions for nginx

Only the last match in a line is caught #3

Closed by gfdsa 11 years ago

gfdsa commented 11 years ago

Hello,

Thanks for the great work on the module!

Running master on nginx/1.2.5 I've noticed this:

my rule:

subs_filter (?<=<img)(.*)(?<=src=)('|")http://www.domain.com/(.+?)(?!php)('|") $1$2http://cdn.domain.com/$3$4 ir;

my original text:

<img src="http://www.domain.com/blah1.png">
<img src="http://www.domain.com/blah2.png">
<img src="http://www.domain.com/blah3.png"><img src="http://www.domain.com/blah4.png"><img src="http://www.domain.com/blah5.png">

result from nginx with the rule applied:

<img src="http://cdn.domain.com/blah1.png">
<img src="http://cdn.domain.com/blah2.png">
<img src="http://domain.com/blah3.png"><img src="http://domain.com/blah4.png"><img src="http://cdn.domain.com/blah5.png">

This happens with or without the 'g' option, and if I add the same subs_filter twice, it does two substitutions per line. I am not good at C, and even less at parsing text in it. I tried to trace it in the code but can't figure out where the match limit comes from.
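
The "one substitution per line" behaviour can be reproduced outside nginx. The following is a minimal sketch using Python's re module as a stand-in for PCRE (an assumption: both treat greedy quantifiers and fixed-width lookbehind the same way for this pattern). The sample line and the names RULE and REPLACEMENT are illustrative, not taken from the module.

```python
import re

# The reported rule, transcribed as-is for Python's re module
# (assumption: PCRE, which nginx uses, treats these constructs the same way).
RULE = re.compile(
    r"""(?<=<img)(.*)(?<=src=)('|")http://www.domain.com/(.+?)(?!php)('|")""",
    re.IGNORECASE,
)
# subs_filter's $1..$4 become \1..\4 in Python.
REPLACEMENT = r"\1\2http://cdn.domain.com/\3\4"

# One input line with three tags (hypothetical sample modelled on the report).
line = (
    '<img src="http://www.domain.com/blah3.png">'
    '<img src="http://www.domain.com/blah4.png">'
    '<img src="http://www.domain.com/blah5.png">'
)

match = RULE.search(line)
# Group 1 is the greedy (.*): it runs from the first <img to the last src=,
# so the earlier tags sit inside the capture and are copied back unchanged.
print(repr(match.group(1)))

result, count = RULE.subn(REPLACEMENT, line)
print(count)   # number of substitutions made on this line
print(result)
```

In this sketch the single match spans almost the whole line, which would explain why only the last URL on each line is rewritten regardless of the 'g' option: after that one match there is nothing left on the line to match again.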

gfdsa commented 11 years ago

Hmm, stupid me for assuming things, sorry. I rewrote it like this:

subs_filter <img(.+?)src=("|')http://www.domain.com/([^"']+)\.(jpeg|jpg|png|gif) <img$1src=$2http://cdn.domain.com/$3.$4 ir;

and it works as expected. So apparently there is some catch with the lookahead/lookbehind part ...
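
For comparison, here is a minimal sketch of the rewritten rule, again using Python's re module as a stand-in for PCRE (an assumption) and a hypothetical sample line. The lazy (.+?) stops at the first src= after each <img, so every tag produces its own match. That suggests the catch is less the lookaround itself than the greedy (.*) that sat next to it in the first rule.

```python
import re

# The rewritten rule, transcribed for Python's re module
# (assumption: PCRE handles the lazy quantifier and character class the same way).
FIXED_RULE = re.compile(
    r"""<img(.+?)src=("|')http://www.domain.com/([^"']+)\.(jpeg|jpg|png|gif)""",
    re.IGNORECASE,
)
# subs_filter's $1..$4 become \1..\4 in Python.
FIXED_REPLACEMENT = r"<img\1src=\2http://cdn.domain.com/\3.\4"

# One input line with three tags (hypothetical sample modelled on the report).
line = (
    '<img src="http://www.domain.com/blah3.png">'
    '<img src="http://www.domain.com/blah4.png">'
    '<img src="http://www.domain.com/blah5.png">'
)

# Each match starts at an <img and ends at the image extension, so the
# scan resumes right after it and can find the next tag on the same line.
result, count = FIXED_RULE.subn(FIXED_REPLACEMENT, line)
print(count)   # number of substitutions made on this line
print(result)
```

Because [^"']+ cannot cross a quote, each match stays inside one src attribute, which keeps the matches short enough for repeated substitution on a single line.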