WebMemex / freeze-dry

Snapshots a web page to get it as a static, self-contained HTML document.
https://freezedry.webmemex.org
The Unlicense

Allow passing custom grabber for frame contents #24

Open Treora opened 6 years ago

Treora commented 6 years ago

As explained in src/Readme:

Although we try to clone each Document living inside a frame (recursively), it may be impossible to access these inner documents because of the browser's single origin policy. If the document inside a frame cannot be accessed, its current state cannot be captured. ... When freeze-dry is run from a more privileged environment, such as a browser extension, it could work around the single origin policy. A future improvement would be to allow providing a custom function getDocInFrame(element) to enable such workarounds.

Treora commented 6 years ago

Some notes to self (or others who wish to implement this):

The current source code suggests that the grabber function would, given an (i)frame element, give back a DOM; the default is to try to read frameElement.contentDocument (lines). I assumed a grabber function would call some privileged code (a content script inside the frame) to access the iframe's DOM and return that, possibly after having serialised, messaged, and parsed it again.

However, serialising, messaging, and parsing the document would break its living parts: the result is an inert copy, so we could no longer access any frames within the frame, nor canvas data (#18) or form state (#19).
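
To make the grabber idea above concrete, here is a minimal sketch of what such a hook might look like. It is not freeze-dry's actual API: the names GetDocInFrame, defaultGetDocInFrame and grabFrameDoc are hypothetical, and FrameLike and DocLike are stand-ins for HTMLIFrameElement and Document so that the snippet also runs outside a browser.

```typescript
// Minimal structural stand-ins for the DOM types (assumption for illustration).
// In a browser these would be Document and HTMLIFrameElement.
interface DocLike {
  title: string;
}
interface FrameLike {
  // The browser exposes null here when the framed document is cross-origin.
  contentDocument: DocLike | null;
}

// Hypothetical signature for the custom grabber proposed in this issue.
type GetDocInFrame = (frame: FrameLike) => DocLike | null;

// Default behaviour as described above: simply try to read contentDocument.
const defaultGetDocInFrame: GetDocInFrame = (frame) => frame.contentDocument;

// Hypothetical helper: use the caller's grabber if one is given, else the default.
function grabFrameDoc(
  frame: FrameLike,
  getDocInFrame: GetDocInFrame = defaultGetDocInFrame,
): DocLike | null {
  return getDocInFrame(frame);
}
```

A privileged caller, such as a browser extension, could then pass a grabber that falls back to whatever its content script obtained when contentDocument is null. Note that a document returned this way after a serialise/parse round trip would be an inert copy, which is exactly the limitation described above.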

Some possible solutions: