zzzprojects / html-agility-pack

Html Agility Pack (HAP) is a free and open-source HTML parser written in C# to read/write DOM and supports plain XPATH or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
https://html-agility-pack.net
MIT License
2.63k stars, 375 forks

The HTML rendering result is different from the HTML output result when tbody is added inside an unclosed th #530

Closed: pauloortins closed this issue 8 months ago

pauloortins commented 8 months ago

When a tbody tag appears inside an unclosed th, HAP's output differs from the way Chrome renders it. While Chrome closes the entire thead, HAP adds the tbody tag inside the th.

HTML: <table><thead><tr><th></th><th></th><th><tbody></table>

HAP Result: <table><thead><tr><th></th><th></th><th><tbody></tbody></th></tr></thead></table>

Chrome Result: <table><thead><tr><th></th><th></th><th></th></tr></thead><tbody></tbody></table>

This is the test I used:

using HtmlAgilityPack;

// Malformed input: the third <th> is never closed before <tbody> starts.
var html = @"<table><thead><tr><th></th><th></th><th><tbody></table>";
var doc = new HtmlDocument();
doc.LoadHtml(html);
var newHtml = doc.DocumentNode.OuterHtml;  // HAP result: <table><thead><tr><th></th><th></th><th><tbody></tbody></th></tr></thead></table>
var res = @"<table><thead><tr><th></th><th></th><th></th></tr></thead><tbody></tbody></table>"; // Chrome or Edge rendering result
var b = newHtml == res; // false
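
Comparing serialized strings can be brittle, so a structural check may be easier to read. The following is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced and the code runs as a C# top-level program. It only reports which element ends up as the parent of tbody; the variable names are illustrative and no particular output is guaranteed for any given HAP version.

```csharp
using System;
using HtmlAgilityPack;

// Same malformed input as in the report: the last <th> is left open.
var html = @"<table><thead><tr><th></th><th></th><th><tbody></table>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Locate the tbody element wherever the parser placed it.
var tbody = doc.DocumentNode.SelectSingleNode("//tbody");

// Chrome makes tbody a direct child of table; the behavior reported
// here nests it under th instead. Print the parent to see which applies.
var parentName = tbody?.ParentNode?.Name ?? "(no tbody found)";
Console.WriteLine($"tbody parent: {parentName}");
```

If the parent is reported as th, the parser nested tbody as described in this issue; a parent of table would match the browser's tree.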
JonathanMagnan commented 8 months ago

Thank you for reporting.

The fix has been made and will be available the next time we deploy Html Agility Pack.

Best Regards,

Jon

JonathanMagnan commented 8 months ago

Hello @pauloortins,

Version v1.11.58 has been released.

The issue you reported should now be fixed.

Best Regards,

Jon

pauloortins commented 8 months ago

Thank you, Jon!
