In case this helps: The Nokogiri changelog lists this as a known breakage (part of a security fix):
- [CRuby] The libxml2 HTML parser in v2.9.14 recovers from some broken markup differently. Notably, the XML CDATA escape sequence <![CDATA[ and incorrectly-opened comments will result in HTML text nodes starting with <! instead of skipping the invalid tag. This behavior is a direct result of the quadratic-behavior fix noted above. The behavior of downstream sanitizers relying on this behavior will also change. Some tests describing the changed behavior are in test/html4/test_comments.rb.
So apparently we're dealing with broken markup here? Is that intended? (I did not look into the Truncato code yet.)
FWIW, in a Rails context, tossing this in an initializer will avoid the bug:
silence_warnings do
  Truncato::ARTIFICIAL_ROOT_NAME = "truncato-artificial-root".freeze
end
That will override the gem-defined value, which is probably an invalid tag name due to the underscores: https://github.com/jorgemanrubia/truncato/blob/7b93028ce9988810d3f95d513b7bc60f0a8fe7bd/lib/truncato/truncato.rb#L9
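For anyone outside Rails (where silence_warnings is not available), here is a rough plain-Ruby sketch of the same idea. It is not taken from Truncato: TruncatoStandIn, its placeholder value, and the helpers conservative_tag_name? and override_constant are all made up for illustration, and the name check is a deliberately strict guess (ASCII letter first, then letters, digits, or hyphens), not the parser's actual rule.

```ruby
# Hypothetical stand-in for the gem's module so the sketch runs without
# Truncato installed. The value is a placeholder, not the gem's real value.
module TruncatoStandIn
  ARTIFICIAL_ROOT_NAME = "__example_root__"
end

# Conservative check (an assumption, stricter than any real parser):
# the name must start with an ASCII letter and contain only letters,
# digits, or hyphens after that.
def conservative_tag_name?(name)
  name.match?(/\A[A-Za-z][A-Za-z0-9-]*\z/)
end

# Replace a constant without triggering Ruby's "already initialized
# constant" warning: remove the old definition first, then set the new one.
def override_constant(mod, name, value)
  mod.send(:remove_const, name) if mod.const_defined?(name, false)
  mod.const_set(name, value)
end

replacement = "truncato-artificial-root".freeze
raise ArgumentError, "unexpected tag name" unless conservative_tag_name?(replacement)

override_constant(TruncatoStandIn, :ARTIFICIAL_ROOT_NAME, replacement)
puts TruncatoStandIn::ARTIFICIAL_ROOT_NAME
```

With the real gem loaded, the same call would presumably target Truncato instead of the stand-in. Note that this only helps if the gem reads the constant at call time rather than caching it elsewhere, which I have not verified.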
PR opened here: https://github.com/jorgemanrubia/truncato/pull/21
After upgrading Nokogiri to version 1.13.5 (or 1.13.6), we get this:
Can you please advise?