oalders / html-restrict

HTML::Restrict - Strip away unwanted HTML tags

Ensure stripper stack is always reset #8

Closed evoyy closed 11 years ago

evoyy commented 11 years ago

I have found a bug in the strip_enclosed_content implementation. When tags are deleted from the stripper stack in _delete_tag_from_stack(), the code assumes that the HTML within those tags is valid. If the HTML is broken, the stack is not cleared between calls to process(). This not only makes the $hr object useless, but the stripper stack also keeps growing with every call to $hr->process(), like this:

["script"]
["script", "script"]
["script", "script", "script"]
["script", "script", "script", "script"]
["script", "script", "script", "script", "script"]
["script", "script", "script", "script", "script", "script"]

As a test case, try this:

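use strict;
use warnings;
use Data::Dump;
use HTML::Restrict;

# One object reused across calls, so stale state carries over.
my $hr = HTML::Restrict->new;

# The trailing comments show the output observed with the bug present.
dd $hr->process('xxxxxxx');        # "xxxxxxx"
dd $hr->process('<script < b >');  # undef (broken HTML, stack left non-empty)
dd $hr->process('xxxxxxx');        # undef
dd $hr->process('xxxxxxx');        # undef
dd $hr->process('xxxxxxx');        # undef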
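
To make the failure mode concrete without depending on the module's internals, here is a minimal sketch of the same pattern. ToyStripper, its reset option and stack_depth() are hypothetical names invented for illustration; this is not the HTML::Restrict source, and the tokenizer only recognises literal <script> and </script> tags. It shows a stack that survives between calls to process(), and one way it could be avoided by emptying the stack at the start of each call.

```perl
use strict;
use warnings;

# Hypothetical toy stripper, NOT the HTML::Restrict implementation.
# It only models a stack of currently open "strip the content" tags.
package ToyStripper;

sub new {
    my ( $class, %args ) = @_;
    return bless { stack => [], reset => $args{reset} ? 1 : 0 }, $class;
}

sub stack_depth { return scalar @{ $_[0]{stack} } }

sub process {
    my ( $self, $input ) = @_;

    # The fix being illustrated: begin every call with an empty stack.
    @{ $self->{stack} } = () if $self->{reset};

    my $out = '';
    for my $token ( split m{(</?script>)}, $input ) {
        if    ( $token eq '<script>' )  { push @{ $self->{stack} }, 'script' }
        elsif ( $token eq '</script>' ) { pop @{ $self->{stack} } }
        elsif ( !@{ $self->{stack} } )  { $out .= $token }    # text outside <script>
    }
    return $out;
}

package main;

my $buggy = ToyStripper->new;
my $fixed = ToyStripper->new( reset => 1 );

# Feed both objects the same unclosed tag three times.
for my $hr ( $buggy, $fixed ) {
    $hr->process('<script>alert(1)') for 1 .. 3;
}

printf "buggy: depth=%d, output='%s'\n", $buggy->stack_depth, $buggy->process('xxxxxxx');
printf "fixed: depth=%d, output='%s'\n", $fixed->stack_depth, $fixed->process('xxxxxxx');
```

In this toy model the object without the reset accumulates one stack entry per broken input and then swallows all later text, while the object with the reset only carries the leftover entry until the next call begins.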

oalders commented 11 years ago

Thanks for fixing this. There was a merge conflict, but I've handled that manually. :)