Open dedene opened 12 months ago
(About the force-push: I've squashed my changes so far into a single commit.)
Thanks for submitting this! It may be a day or two before I'm able to review.
@dedene Thank you for your patience!
So I'm proceeding carefully here for the moment, since Rails::HTML::Sanitizer is sensitive to empty/boolean attributes. See https://github.com/rails/rails-html-sanitizer/pull/136 for the original description of the general problem back in June 2022.
I had been waiting for HTML5 parsing to land in the sanitizer stack before tackling some of these behavioral edge cases. This PR might be the right answer, but I want to try to see if we can get the underlying parser (libgumbo) to do the right thing here first.
All of which is to say: I'm going to play with this for a bit.
We are using Loofah in a number of projects where the scrubbing of empty attributes and boolean attributes became an issue. This PR adds support for boolean attributes and empty string values on certain node types. It fixes #242.
For example, `<option value="">Empty Value</option>` is perfectly safe HTML, but the empty value was stripped when using the scrubber.

It also adds support for boolean attributes (e.g. `download` on an `<a>` element, or `autoplay` on a `<video>` tag). I could not get Nokogiri to output them as boolean attributes, but the HTML5 specification (section 3.2.2) specifies that an empty string is also fine.

The behaviour from https://github.com/flavorjones/loofah/pull/51 is still the same, so the risk of unwanted regressions is minimal imho.
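
To make the intended rule concrete, here is a minimal sketch of the kind of check described above. It is not the code in this PR and not Loofah's API: the constant names, the helper `keep_empty_attribute?`, and the contents of both allow-lists are hypothetical and only cover the examples mentioned in this description.

```ruby
require "set"

# Hypothetical allow-lists, for illustration only. They are NOT Loofah's
# actual tables; they only cover the examples named in this description.
BOOLEAN_ATTRIBUTES = Set.new(%w[autoplay download]).freeze
EMPTY_VALUE_ALLOWED = { "option" => Set.new(%w[value]) }.freeze

# Returns true when an attribute whose value is the empty string should be
# kept on the given element instead of being scrubbed.
# Both arguments are compared case-insensitively.
def keep_empty_attribute?(element, name)
  element = element.downcase
  name = name.downcase

  # Attributes treated as boolean may be serialized as name="".
  return true if BOOLEAN_ATTRIBUTES.include?(name)

  # Some element/attribute pairs are meaningful with an empty value,
  # e.g. <option value="">.
  EMPTY_VALUE_ALLOWED.fetch(element, Set.new).include?(name)
end
```

With these assumed lists, `keep_empty_attribute?("option", "value")` and `keep_empty_attribute?("video", "autoplay")` return `true`, while an empty `href` on an `<a>` is still rejected, so the existing scrubbing of other empty attributes is unchanged.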
The tests on GitHub Actions seem to fail for truffleruby. That seems to be related to https://github.com/ruby/stringio/pull/71, which just got merged, and not to the actual code changes in this PR.
Feel free to make or suggest changes if needed. Thanks a lot for having a look at this!