gildas-lormeau / SingleFile

Web Extension for saving a faithful copy of a complete web page in a single HTML file
GNU Affero General Public License v3.0

Improper DOM Generation Prevents Correct HTML Rendering #1578

Open Lucacici opened 1 month ago

Lucacici commented 1 month ago

Some websites generate DOM structures in a non-standard manner, leading to HTML files that cannot be rendered correctly and require manual fixes.

Example

Problem: The browser cannot correctly render a p element that contains a nested ul. Changing the p to a div makes it render correctly.

Environment

Request: Check whether a p element contains any block-level elements, and replace it with a div if necessary.
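The requested check could look roughly like the sketch below. It is only an illustration, not SingleFile's code: `ElementNode`, `NON_PHRASING_TAGS`, `containsNonPhrasing` and `replaceInvalidParagraphs` are hypothetical names, the tree is a plain object model rather than the real DOM, and the tag list is an illustrative subset rather than the complete set of non-phrasing elements.

```typescript
// Hypothetical minimal tree model; a real implementation would walk DOM elements instead.
interface ElementNode {
  tag: string;
  children: ElementNode[];
}

// Illustrative subset of tags that are not phrasing content (assumption: not exhaustive).
const NON_PHRASING_TAGS: readonly string[] = [
  "address", "article", "aside", "blockquote", "div", "dl", "fieldset",
  "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5", "h6", "header",
  "hr", "main", "nav", "ol", "p", "pre", "section", "table", "ul",
];

// Returns true if any descendant of `node` is a non-phrasing element.
function containsNonPhrasing(node: ElementNode): boolean {
  return node.children.some(
    (child) =>
      NON_PHRASING_TAGS.indexOf(child.tag) !== -1 || containsNonPhrasing(child)
  );
}

// Returns a copy of the tree in which every p element that has a
// non-phrasing descendant is renamed to div. The input is not mutated.
function replaceInvalidParagraphs(node: ElementNode): ElementNode {
  const children = node.children.map(replaceInvalidParagraphs);
  const tag = node.tag === "p" && containsNonPhrasing(node) ? "div" : node.tag;
  return { tag, children };
}
```

The check looks at all descendants rather than direct children only, so a p nested inside another p causes the outer one to be renamed as well. Renaming an element can also change which CSS rules match it (for example selectors targeting `p`), so a real fix would need to account for styling too.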

gildas-lormeau commented 1 month ago

The problem is that <div> is the tag of a block-level element and is also disallowed in a <p> tag. Only these tags are valid: https://developer.mozilla.org/en-US/docs/Web/HTML/Content_categories#phrasing_content.

Lucacici commented 1 month ago

> The problem is that <div> is the tag of a block-level element and is also disallowed in a <p> tag. Only these tags are valid: https://developer.mozilla.org/en-US/docs/Web/HTML/Content_categories#phrasing_content.

Yes, I understand what you're saying.

What I mean is that the dynamic pages served by these sites generate an invalid DOM, which the browser still renders correctly. Once that same invalid DOM is saved as static HTML (such as a saved web page), the browser can no longer render it properly.

I'm not sure if I should clean up the mess for the website.
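
The difference between the live page and the saved file can be illustrated with a simplified model of what an HTML parser does when it reads the saved markup: a start tag such as ul implicitly closes an open p, and a stray </p> produces an empty p element. DOM built by scripts does not go through this step, which is why it can hold a ul inside a p. The sketch below is a rough approximation under stated assumptions: `Token`, `CLOSES_OPEN_P` and `reparse` are hypothetical names, the tag list is an illustrative subset, and the real parsing rules in the HTML Standard are considerably more detailed.

```typescript
type Token =
  | { kind: "start"; tag: string }
  | { kind: "end"; tag: string }
  | { kind: "text"; text: string };

// Illustrative subset of start tags that implicitly close an open p (assumption: not exhaustive).
const CLOSES_OPEN_P: readonly string[] = [
  "address", "article", "aside", "blockquote", "div", "dl", "fieldset",
  "footer", "header", "main", "nav", "ol", "p", "section", "ul",
];

// Re-serializes a token stream the way a simplified parser would structure it.
function reparse(tokens: Token[]): string {
  const open: string[] = []; // stack of currently open element names
  let out = "";

  // Pops open elements up to and including `tag`, emitting their end tags.
  const closeThrough = (tag: string): void => {
    while (open.length > 0) {
      const top = open.pop() as string;
      out += `</${top}>`;
      if (top === tag) break;
    }
  };

  for (const token of tokens) {
    if (token.kind === "text") {
      out += token.text;
    } else if (token.kind === "start") {
      // A block-level start tag closes any p that is still open.
      if (CLOSES_OPEN_P.indexOf(token.tag) !== -1 && open.indexOf("p") !== -1) {
        closeThrough("p");
      }
      open.push(token.tag);
      out += `<${token.tag}>`;
    } else if (open.indexOf(token.tag) !== -1) {
      closeThrough(token.tag);
    } else if (token.tag === "p") {
      // An end tag for p with no open p yields an empty p element.
      out += "<p></p>";
    }
    // Any other unmatched end tag is ignored in this model.
  }
  return out;
}

// Tokens for the saved markup <p><ul><li>x</li></ul></p>
const saved: Token[] = [
  { kind: "start", tag: "p" },
  { kind: "start", tag: "ul" },
  { kind: "start", tag: "li" },
  { kind: "text", text: "x" },
  { kind: "end", tag: "li" },
  { kind: "end", tag: "ul" },
  { kind: "end", tag: "p" },
];
console.log(reparse(saved)); // prints "<p></p><ul><li>x</li></ul><p></p>"
```

In this model the ul ends up as a sibling of two empty p elements instead of a child of the original p, so styles and layout that depend on the p wrapping the list no longer apply.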