zzzprojects / html-agility-pack

Html Agility Pack (HAP) is a free and open-source HTML parser written in C# that reads and writes the DOM and supports plain XPath or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
https://html-agility-pack.net
MIT License

Attributes with spaces are not quoted if the attribute was loaded without quotes #568

Closed dylanstreb closed 2 months ago

dylanstreb commented 2 months ago

1. Description

Attributes with spaces are not quoted if the attribute was loaded without quotes

3. Fiddle or Project

Fiddle: https://dotnetfiddle.net/qj0Iav

The output from HAP includes <div class=test cls>Text</div>. It should be <div class="test cls">Text</div>. A browser parses the current output as <div class="test" cls>Text</div>: the class is only "test", followed by a separate cls attribute with no value.
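To illustrate how a parser reads the two forms, here is a minimal sketch using Python's standard-library html.parser rather than HAP or a real browser. The AttrCollector class and the parse_attrs helper are hypothetical names introduced only for this illustration.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records the tag name and attribute list of every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None
        # when the attribute has no value at all.
        self.seen.append((tag, attrs))


def parse_attrs(markup):
    """Parse markup and return the recorded (tag, attrs) pairs."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Unquoted form: the space ends the class value, so "cls" is
# parsed as a second, valueless attribute.
unquoted = parse_attrs("<div class=test cls>Text</div>")

# Quoted form: a single class attribute holding both class names.
quoted = parse_attrs('<div class="test cls">Text</div>')

print(unquoted)
print(quoted)
```

The unquoted markup yields two attributes (class and cls), while the quoted markup yields a single class attribute, which matches the browser behavior described above.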

This is an issue with minified input. Minifiers strip the quotes around attribute values when it is valid to do so. When HAP loads such an attribute, it tries to preserve the original quotation style. However, if the value is later changed, HAP does not check that the unquoted style is still legal for the new value.

The example uses AddClass, but the same problem should apply to anything that modifies an attribute value. I believe characters other than a space also require quotes. I don't know the exact list, but probably any non-word character.
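For reference, the HTML Standard's list is narrower than "any non-word character": an unquoted attribute value must be non-empty and must not contain ASCII whitespace, ", ', =, <, > or `. The sketch below applies that rule when writing an attribute. It is not HAP's actual fix; needs_quotes, write_attribute and the original_quote parameter are hypothetical names, and escaping only the quote character is a simplifying assumption.

```python
# Characters the HTML Standard forbids in an unquoted attribute value:
# ASCII whitespace plus " ' = < > and the backtick.
UNSAFE_UNQUOTED = set(' \t\n\f\r"\'=<>`')


def needs_quotes(value):
    """Return True if value cannot legally be written without quotes."""
    return value == "" or any(ch in UNSAFE_UNQUOTED for ch in value)


def write_attribute(name, value, original_quote=None):
    """Serialize one attribute, keeping the loaded quote style when legal.

    original_quote is '"', "'", or None for an attribute that was
    loaded without quotes (hypothetical parameter for this sketch).
    """
    if original_quote is None and not needs_quotes(value):
        # The unquoted style is still valid for the current value.
        return f"{name}={value}"
    # Fall back to double quotes when the unquoted style is no longer legal.
    quote = original_quote or '"'
    # Assumption: only the chosen quote character is escaped here.
    entity = "&quot;" if quote == '"' else "&#39;"
    return f"{name}={quote}{value.replace(quote, entity)}{quote}"


print(write_attribute("class", "test"))      # value unchanged, stays unquoted
print(write_attribute("class", "test cls"))  # value now contains a space
```

Note that a value such as a-b contains a non-word character yet is still legal without quotes, so the check is made against the specific character set rather than against word characters.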

4. Any further technical details

JonathanMagnan commented 2 months ago

Hello @dylanstreb ,

Thank you for reporting. We will look at it.

Best Regards,

Jon

JonathanMagnan commented 2 months ago

Hello @dylanstreb ,

Version v1.11.65 has been released and fixes this issue.

Let me know if everything is now working correctly.

Best Regards,

Jon

dylanstreb commented 2 months ago

1.11.65 fixes the issue. Thank you!