voku / anti-xss

㊙️ AntiXSS | Protection against Cross-site scripting (XSS) via PHP
MIT License

JSON Encoded HTML attribute issues #143

Open breconwhite opened 8 months ago

breconwhite commented 8 months ago

What is this feature about (expected vs actual behaviour)?

When HTML is sent as part of a JSON request, xss_clean mishandles escaped quotation marks. Specifically, when the HTML is nested inside a JSON string and its attribute quotes are double-escaped, the _filter_attributes function strips the attributes from anchor tags.

e.g. a JSON string like "{\"text\": \"<a href=\\\"https://google.com\\\">Google</a>\"}" is returned as "{\"text\": \"<a >Google</a>\"}"
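
To make the escaping layers concrete, here is a minimal sketch in plain PHP (7.3+ for JSON_THROW_ON_ERROR). It shows that the raw JSON text carries a backslash before each attribute quote, while the decoded value is ordinary HTML. The helper cleanJsonStrings is hypothetical and not part of anti-xss: it illustrates one possible workaround (decode first, clean each string value, re-encode) and is not something this report proposes or the library provides.

```php
<?php
// Raw JSON text from the report; single quotes keep the backslashes literal.
$rawJson = '{"text": "<a href=\"https://google.com\">Google</a>"}';

// Before decoding, every attribute quote is preceded by a backslash.
$rawHasEscapedQuote = strpos($rawJson, 'href=\"') !== false;

/**
 * Hypothetical helper: decode JSON, apply $clean to every string value,
 * then re-encode. Assumes the top-level JSON value is an object or array.
 */
function cleanJsonStrings(string $json, callable $clean): string
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    array_walk_recursive($data, function (&$value) use ($clean) {
        if (is_string($value)) {
            $value = $clean($value);
        }
    });
    return json_encode($data, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES);
}

// After decoding, the value is ordinary HTML with unescaped quotes.
$html = json_decode($rawJson, true, 512, JSON_THROW_ON_ERROR)['text'];
```

With the library installed, $clean could presumably be a callable wrapping xss_clean on an AntiXSS instance, so the filter only ever sees the decoded HTML. Decoding to associative arrays turns empty JSON objects into empty arrays, so this sketch is not a lossless round trip for every payload.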

Any additional information?

I think this could be solved by updating the regex on line 995 in _filter_attributes to accept \" as a possible attribute quote, for example by changing the capture group to ("|\'|\") as follows: `'#\s*[\p{L}\d-[]]+\s=\s("|\'|\")(?:[^\1]*?)\1#u'`
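
The idea can be tried in isolation with a simplified stand-in pattern. The two patterns below are hypothetical illustrations, not the regex from the library. One detail worth checking, assuming the pattern lives in a single-quoted PHP string: there, \" inside the group still matches only a bare double quote, so matching a literal backslash followed by a quote needs four backslashes in the source. The sketch uses (.*?) for the value because, in PCRE, \1 inside a character class is not read as a backreference.

```php
<?php
// Markup as it appears inside the raw JSON text: each quote has a backslash.
// A single-quoted PHP string keeps \" as two characters (backslash, quote).
$escaped = '<a href=\"https://google.com\">Google</a>';

// Hypothetical baseline: only a bare " or ' may delimit the attribute value.
$plainQuotes = '#\s*[\p{L}\d_-]+\s*=\s*("|\')(.*?)\1#u';

// Hypothetical variant: also accept backslash + double quote as a delimiter.
// Four backslashes in PHP source become one literal backslash for the regex.
$withEscaped = '#\s*[\p{L}\d_-]+\s*=\s*(\\\\"|"|\')(.*?)\1#u';

// The baseline finds no attribute, because "=" is followed by a backslash.
$plainHit = preg_match($plainQuotes, $escaped);

// The variant captures the delimiter in $m[1] and the value in $m[2].
$escapedHit = preg_match($withEscaped, $escaped, $m);
```

Here $plainHit is 0 and $escapedHit is 1, with $m[2] holding https://google.com. Whether accepting escaped quotes is safe for a sanitizer is a separate question that this sketch does not address.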