webrecorder / wombat

Wombat.js client-side rewriting library
GNU Affero General Public License v3.0

Wombat incorrectly rewrites https://www.emsd.gov.hk/energylabel/tc/about/background2.html #123

galgeek closed this issue 11 months ago

galgeek commented 11 months ago

Captures' main content area fails to scroll as a result of this bug.

Observed with both ArchiveWeb.page / ReplayWeb.page and the Wayback Machine: https://web.archive.org/web/20221207105615/https://www.emsd.gov.hk/energylabel/en/about/background2.html

ikreymer commented 11 months ago

Appears to be a CSS issue: changing the elements from position: fixed to position: absolute seems to make the page scrollable again. Interesting that it happens in both iframe and frameless mode. I wonder what is causing the fixed version to not calculate height correctly in certain cases.

ikreymer commented 11 months ago

Ah, never mind, found that the issue is due to a missing closing </div> tag. A fix is coming in a branch; it works in this case, but may not be generic enough. The issue is multiple document.write calls that split the full HTML across calls, e.g.:

document.write('<div><a href="...">...</a>');
document.write('<a href="...">...</a></div>');

These are called individually. The first document.write() is handled correctly: the rewriter avoids adding a closing </div> tag that the page did not write. The second call, however, was also dropping the final </div> tag, because within that call alone it looks like an extra, unmatched closing tag. The fix currently detects this case.
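To illustrate the idea, here is a rough sketch of how a closing tag that looks unmatched within a single write could be recognized as matching a tag opened by an earlier write. This is not the actual wombat code: scanTags and WriteTracker are hypothetical names made up for this example, and the regex-based tag scan is a simplification that ignores comments, scripts, and unquoted attribute values.

```javascript
// Hypothetical sketch, NOT wombat's implementation.
// Void elements never take a closing tag, so they do not affect the balance.
const VOID_TAGS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

// Scan one HTML fragment. Returns the tags left open at the end of the
// fragment and the closing tags that have no matching open tag inside it.
function scanTags(html) {
  const open = [];
  const unmatchedClose = [];
  const tagRe = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(\/?)>/g;
  let m;
  while ((m = tagRe.exec(html)) !== null) {
    const name = m[2].toLowerCase();
    // Skip void elements and self-closing syntax such as <br/>.
    if (VOID_TAGS.has(name) || m[3] === "/") continue;
    if (m[1] === "/") {
      const idx = open.lastIndexOf(name);
      if (idx === -1) unmatchedClose.push(name);
      else open.splice(idx, 1);
    } else {
      open.push(name);
    }
  }
  return { open, unmatchedClose };
}

// Carries the still-open tags from one write() call to the next.
class WriteTracker {
  constructor() {
    this.pending = []; // tags opened by earlier writes and not yet closed
  }

  // Returns only the closing tags that are truly extra, i.e. not
  // explained by a tag left open in a previous write.
  write(html) {
    const { open, unmatchedClose } = scanTags(html);
    const extra = [];
    for (const name of unmatchedClose) {
      const idx = this.pending.lastIndexOf(name);
      if (idx === -1) extra.push(name);
      else this.pending.splice(idx, 1);
    }
    this.pending.push(...open);
    return extra;
  }
}

const tracker = new WriteTracker();
console.log(tracker.write('<div><a href="#one">one</a>'));  // first half opens <div>
console.log(tracker.write('<a href="#two">two</a></div>')); // second half closes it
console.log(tracker.write('</div>'));                       // nothing left to close
```

In this sketch the first two calls report no extra closing tags, since the </div> in the second write matches the <div> left open by the first. Only the third call reports "div" as extra.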

Ideally, these calls could be grouped into one, but that is tricky to do, as there's no document.open() / document.close() involved if write() is called during load (as happens to be the case on this site). (See: https://web.archive.org/web/20221108171417js_/https://www.emsd.gov.hk/energylabel/assets/main/main.js)

ikreymer commented 11 months ago

Fixed via 3.6.0 release