DanMcInerney / xsscrapy

XSS spider - 66/66 wavsep XSS detected

false positives on pages with long single lines #9

Closed by DanMcInerney 10 years ago

DanMcInerney commented 10 years ago

Just need to figure out how to get an lxml Element's index from within either the doc or just that line. Working on it.

DanMcInerney commented 10 years ago

Almost done.

DanMcInerney commented 10 years ago

All done. Explanation of changes:

-Previously, the script sent one request carrying a single unpayloaded test string to every place an XSS could hide and checked the response for injection. If the string was reflected, it sent an actual payloaded test string to that location and used the lxml data from the original response to match against the regex matches of the injection points in the new payloaded responses. Now it sends just one payloaded test string to each location an XSS could hide. When analyzing the response, it replaces the HTML characters, then uses lxml to parse the attribute, location, and tag of the injection point. Huge resource reduction there.
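To make the single-request flow concrete, here is a minimal sketch, not the actual xsscrapy code. It assumes a hypothetical delimiter `MARKER` wrapped around the payload characters, and it swaps in the standard library's `html.parser` for lxml so the example stays self-contained. The names `MARKER`, `PAYLOAD`, `TOKEN`, `InjectionFinder`, and `locate_injections` are all made up for illustration.

```python
import re
from html.parser import HTMLParser

MARKER = "9zqjx"                        # hypothetical delimiter, unlikely to occur naturally
PAYLOAD = MARKER + "'\"()<>" + MARKER   # hypothetical payloaded test string sent to each input
TOKEN = "xsstoken0"                     # parser-safe stand-in for the reflected payload
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class InjectionFinder(HTMLParser):
    """Records the tag, location type, and attribute name of each reflection."""

    def __init__(self, token):
        super().__init__(convert_charrefs=True)
        self.token = token
        self.stack = []   # currently open tags
        self.found = []   # (tag, "attribute" | "text", attribute name or None)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)
        for name, value in attrs:
            if value and self.token in value:
                self.found.append((tag, "attribute", name))

    def handle_endtag(self, tag):
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.token in data:
            parent = self.stack[-1] if self.stack else None
            self.found.append((parent, "text", None))


def locate_injections(body):
    """Return (injection points, payload characters that survived filtering)."""
    pattern = re.compile(re.escape(MARKER) + r"(.*?)" + re.escape(MARKER), re.S)
    survived = pattern.findall(body)
    # Swap the reflected payload for a plain token so stray quotes or
    # angle brackets in it cannot confuse the HTML parser.
    safe_body = pattern.sub(TOKEN, body)
    finder = InjectionFinder(TOKEN)
    finder.feed(safe_body)
    finder.close()
    return finder.found, survived
```

Calling `locate_injections(response_text)` on one response yields both where the payload landed and which dangerous characters came back unescaped, so no second request is needed in this sketch.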

-Much, much improved detection of which quotes are important. This is difficult because HTML attribute delimiter quotes can be both " and ' in a single page (but not around a single attribute value), and on top of that JavaScript can do the same thing: use both ' and " on a single page. Now the script finds the injection point, moves back to the parent HTML tag, then analyzes the strings from the parent tag up to the injection point to determine which quote, if any, is a breakout quote.
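The quote analysis can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the script's real logic: the hypothetical `breakout_quote` helper treats the nearest preceding `<` as the start of the parent tag and ignores backslash escapes. It is only meaningful for injections inside attribute values or script strings, since an apostrophe in ordinary text would be misread as an open quote.

```python
def breakout_quote(body, injection_index):
    """Return the quote character still open at the injection point, or None.

    Assumes the nearest '<' before the injection starts the parent tag,
    and that quotes are never backslash-escaped.
    """
    tag_start = body.rfind("<", 0, injection_index)
    if tag_start == -1:
        return None

    open_quote = None
    # Walk from the parent tag to the injection point, tracking which
    # quote (if any) is currently open.
    for ch in body[tag_start:injection_index]:
        if open_quote is None:
            if ch in "'\"":
                open_quote = ch
        elif ch == open_quote:
            open_quote = None
        # A different quote character inside an open string is just data.
    return open_quote
```

For example, with `page = '<a href="/search?q=MARK">'`, calling `breakout_quote(page, page.index("MARK"))` walks `<a href="/search?q=` and reports `"` as the quote a payload would need to break out of.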