ezyang / htmlpurifier

Standards compliant HTML filter written in PHP
http://htmlpurifier.org
GNU Lesser General Public License v2.1

rel attribute in anchor tag #172

Open moezkorkmaz opened 6 years ago

moezkorkmaz commented 6 years ago

In some cases the purifier registers changes (= errors) in the ErrorCollector even though the snippet IS valid and the purified output is identical to the input.

Example 1:

original snippet: <a href="http://www.google.de" target="_blank" rel="noreferrer noopener">testlink</a>

purified snippet: <a href="http://www.google.de" target="_blank" rel="noreferrer noopener">testlink</a>

In the case above, the purifier logs the following errors:

  1. rel attribute on <a> removed
  2. Attributes on <a> transformed from href and target to href, target and rel
  3. Attributes on <a> transformed from href, target and rel to href, target and rel

Example 2:

Besides this, changing the order of the values in the rel attribute also returns the same errors:

<a href="http://www.google.de" target="_blank" rel="noopener noreferrer">testlink</a>

ezyang commented 6 years ago

Thanks for the report. You didn't post your full config, but I am guessing you are using the rel/target adder filter? The error collector is "technically" correct in this case; internally there really is a removal (because rel/target are not allowed) and then an addition (because post-validation we add some otherwise forbidden elements).

I can't think of a clean way to resolve this issue. You might be able to fix it on your end by making target/rel "allowed" attributes, for the values you want to accept them.
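A minimal sketch of that suggestion, for illustration only. It assumes a standard HTML Purifier install (the include path is an assumption) and uses the `Attr.AllowedFrameTargets` and `Attr.AllowedRel` directives to whitelist the values from the report. Whether this removes every ErrorCollector entry depends on the rest of the config and the library version, so treat it as a starting point rather than a confirmed fix.

```php
<?php
// Assumption: adjust this path to wherever HTML Purifier is installed.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
$config->set('Core.CollectErrors', true);

// Accept target="_blank" instead of having it stripped during validation.
$config->set('Attr.AllowedFrameTargets', array('_blank'));

// Accept the two rel values used in the report.
$config->set('Attr.AllowedRel', array('noreferrer', 'noopener'));

$purifier = new HTMLPurifier($config);
$clean = $purifier->purify(
    '<a href="http://www.google.de" target="_blank" rel="noreferrer noopener">testlink</a>'
);

// Inspect what the collector recorded for this run.
$errors = $purifier->context->get('ErrorCollector');
print_r($errors->getRaw());
```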

moezkorkmaz commented 6 years ago

What about checking the attributes (in this case the "rel" attribute and its value) before and after manipulating them, and then looking for actual changes, as a very naive implementation? This is what I actually do now as a workaround (in my own codebase, no fork here).
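The before/after comparison described above could look roughly like the following. This is a hypothetical sketch, not the poster's actual workaround and not part of HTML Purifier: the helper names `anchorAttributes` and `attributesActuallyChanged` are made up, it relies on PHP's bundled DOM extension, and it assumes that the order of `rel` tokens should not count as a change.

```php
<?php
// Hypothetical helper: collect the attributes of every <a> in a fragment,
// normalised so that attribute order and rel token order do not matter.
function anchorAttributes(string $html): array
{
    $doc = new DOMDocument();
    // Wrap the fragment so libxml has a single container to parse.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_NOERROR | LIBXML_NOWARNING);

    $result = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $attrs = [];
        foreach ($anchor->attributes as $attr) {
            $value = $attr->value;
            if ($attr->name === 'rel') {
                // Assumption: rel is an unordered set of tokens.
                $tokens = preg_split('/\s+/', strtolower(trim($value)), -1, PREG_SPLIT_NO_EMPTY);
                sort($tokens);
                $value = implode(' ', $tokens);
            }
            $attrs[$attr->name] = $value;
        }
        ksort($attrs);
        $result[] = $attrs;
    }
    return $result;
}

// Hypothetical helper: true only if some <a> attribute really differs.
function attributesActuallyChanged(string $before, string $after): bool
{
    return anchorAttributes($before) !== anchorAttributes($after);
}
```

With the two snippets from the report as `$before` and `$after` (the same link with `rel="noreferrer noopener"` and `rel="noopener noreferrer"`), `attributesActuallyChanged()` returns `false`, so the collector entries for that link could be ignored by the calling code.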