Can you include an example of your content and how you're sanitizing it, please?
Most likely, AngleSharp (the library which HtmlSanitizer uses to parse markup) is doing this because that's what browsers do - if you use a browser like Firefox, the dev tools will show you the actual markup which your browser is displaying, rather than what was given to it. For a table, that means a <tbody> is injected for any <tr>s present in the root <table> element (or the rows are merged into the existing tbody - it's been a while since I've had to troubleshoot table issues, and it probably varies by vendor anyway).
The gist here is that if you think the behavior is wrong, it's unfortunately an upstream issue - AngleSharp follows browser behavior. Might be worth checking that project's issue board to see if anyone has a similar complaint and/or a fix.
I second what @tiesont said. FWIW here's the same example in the browser console:
let d = document.createElement("div");
d.innerHTML = '<table><tr><td>help</td></tr></table>';
d.innerHTML
-> '<table><tbody><tr><td>help</td></tr></tbody></table>'
Thanks for taking the time to look into this. I'm just removing it after the sanitize process. I'm trying to write a tool to help people format HTML properly for email campaigns. Part of that is HTML size, because a lot of companies charge for egress bandwidth. If you have a big audience, those extra bytes tend to add up.
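For anyone wanting to do the same post-sanitize removal, here is a minimal sketch in JavaScript, to match the console example above. It is not the tool's actual code: `stripTbody` is a hypothetical helper name, and it assumes the `<tbody>` tags carry no attributes you need and that no CSS or script in the email targets `tbody`.

```javascript
// Hypothetical helper: strip parser-inserted <tbody> wrappers from sanitized markup.
// Assumption: only the tbody tags themselves are removed; rows and cells are untouched.
function stripTbody(html) {
  // Matches <tbody>, <tbody ...attrs> and </tbody>, case-insensitively.
  return html.replace(/<\/?tbody\b[^>]*>/gi, "");
}

const sanitized = "<table><tbody><tr><td>help</td></tr></tbody></table>";
const stripped = stripTbody(sanitized);

// "<tbody>" is 7 characters and "</tbody>" is 8, so each table saves 15 bytes of ASCII markup.
const saved = sanitized.length - stripped.length;

console.log(stripped); // <table><tr><td>help</td></tr></table>
console.log(saved);    // 15
```

The receiving mail client's parser will most likely put the `<tbody>` back in its own DOM, so rendering should not change; only the bytes on the wire do.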
Not sure why it inserts <tbody> when it sanitizes, when it wasn't present before. What am I missing?
This <table><tr><td>help</td></tr></table> is converted to <table><tbody><tr><td>help</td></tr></tbody></table> in the command window.
To me the tbody is not necessary; it just takes up extra bandwidth. It doesn't cause any harm other than bandwidth.