HtmlUnit / htmlunit-neko

HtmlUnit adaptation of NekoHtml
Apache License 2.0

Frameset not added to DOM in some malformed HTML #115

Open duonglaiquang opened 4 months ago

duonglaiquang commented 4 months ago

Note: This issue is a migration (for our convenience) of this issue on SourceForge, which can now be closed.

Problem in brief

<frameset> is lost and not added to the DOM in some malformed HTML when it should be.

Examples

These examples demonstrate the issue. Each one shows an input HTML document, the expected DOM, and the DOM that HtmlUnit produces.

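| # | Input HTML | Expected DOM | HtmlUnit's DOM |
|---|---|---|---|
| 1 | (markup not preserved) | (markup not preserved) | (markup not preserved) |
| 2 | (markup not preserved) | Same as above. | Same as above. |
| 3 | (markup not preserved) | Same as above. | (markup not preserved) |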

Note: These already exist as test cases in org.htmlunit.html.parser.MalformedHtmlTest as siblingWithoutContentBeforeFrameset(), framesetInsideForm(), as well as others not covered above.

Remarks

rbri commented 4 months ago

@duonglaiquang @atnak I have an idea how to solve this. The problem here is that some operations require major changes to the part of the DOM tree that has already been constructed. This is not a problem if you are building a DOM tree. But neko also supports the SAX interface, and there it is not possible, because events that have already been delivered cannot be changed afterwards.

Let me think a bit.

And sorry for not being that fast with all your stuff; I have some important private things to take care of.
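
The DOM versus SAX constraint mentioned above can be sketched with the JDK's own XML classes. This is only an illustration, not neko's or HtmlUnit's code: the class `DomVersusSaxSketch`, the `RecordingHandler`, and the event sequence are all hypothetical, and it assumes a parser that has already reported a `<body>` by the time the `<frameset>` shows up.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class DomVersusSaxSketch {

    /** Hypothetical SAX consumer that simply records the events it receives. */
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }
    }

    /** DOM case: the tree is still in memory, so a late frameset can replace the body. */
    static String repairInDom() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            doc.appendChild(html);
            html.appendChild(body);
            body.appendChild(doc.createElement("div"));

            // The frameset arrives late: restructure what was already built.
            Element frameset = doc.createElement("frameset");
            html.replaceChild(frameset, body);
            return html.getFirstChild().getNodeName();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** SAX case: events are pushed to the consumer one by one and cannot be taken back. */
    static List<String> streamToSax() {
        RecordingHandler handler = new RecordingHandler();
        AttributesImpl none = new AttributesImpl();
        handler.startElement("", "html", "html", none);
        handler.startElement("", "body", "body", none);
        handler.startElement("", "div", "div", none);
        handler.endElement("", "div", "div");
        // The frameset arrives now, but start:body has already been delivered.
        handler.endElement("", "body", "body");
        handler.startElement("", "frameset", "frameset", none);
        handler.endElement("", "frameset", "frameset");
        handler.endElement("", "html", "html");
        return handler.events;
    }

    public static void main(String[] args) {
        System.out.println(repairInDom());
        System.out.println(streamToSax());
    }
}
```

In the DOM variant the already-built `<body>` is simply swapped out, while the SAX consumer has seen `start:body` before the frameset appears and has no way to un-see it. A fix that works for both interfaces would presumably have to hold back or buffer events before emitting them, which is a guess at the direction, not something stated in the thread.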