automerge / pushpin

A collaborative corkboard app
BSD 3-Clause "New" or "Revised" License

UrlContent renders broken pages on some urls e.g. github #403

Open Gozala opened 3 years ago

Gozala commented 3 years ago

@pvh and I have discussed this a bit on a higher-bandwidth channel; however, since we were not able to apply a simple fix, I'm creating an issue here for further discussion.

We have identified the problem as being caused by freeze-dry failing to fetch some resources, e.g. https://github.githubassets.com/assets/frameworks-e9336e6b26b7848c1bbe761eacff9e7d.css referenced from https://github.com/automerge/automerge, due to the CSP explicitly restricting that.

That is why the captured page lacks styling when it is rendered. It is unclear, however, why the fetch used in the preload script follows the same CSP restrictions as the page (I was expecting it to run in a more privileged context, like an extension).

Gozala commented 3 years ago

I think there are several options for addressing this issue, so I would like to list them here to gather some additional feedback.

  1. Use the built-in contents.savePage API, which according to the docs can save the page in MHTML format. It is not a well-supported file format, but it will probably do the job of saving the page and then rendering it in Electron.
    • The tradeoff here is that, unlike freeze-dry, it doesn't capture the DOM in a specific state. However, since the page is currently just loaded in the background and then captured, the result is probably going to be identical.
    • I am not sure how this impacts the companion web extension, but I imagine it's OK if the extension captures in one format and the app in another.
  2. Learn from https://github.com/webrecorder/archiveweb.page and use the Chrome DevTools Protocol for capturing resources. E.g. the webContents.debugger API provides access to the debugger protocol, which seems to have a Page.getResourceContent method that can be used to get page resources without having to fetch them. It seems a bit complicated but might be worth it. Maybe this could even be used to e.g. connect to a running Chrome browser and capture content from there, although I'm not sure it is any better than the current extension setup.
  3. Do the freeze-drying in a Node context. It's not clear how the current hidden webview setup came to be. I think it might be just fine to capture things using jsdom in Node instead. I have experience doing this and it seemed to work just fine for public content. Anything behind a session is not going to work with the current setup anyway, so this might be a simple yet practical option to go with.
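
To make option 2 a bit more concrete, here is a rough sketch of what reading resources through the debugger protocol might look like. It is not tested against pushpin and makes several assumptions: `DebuggerLike`, `CapturedResource` and `captureResources` are names made up for this example, `DebuggerLike` is a hand-written subset of what Electron's `webContents.debugger` exposes (`attach`, `sendCommand`, `detach`), and `Page.getResourceTree` / `Page.getResourceContent` are marked experimental in the protocol, so their shape may change. Whether this actually sidesteps the CSP failure would still need to be verified.

```typescript
// Hypothetical structural subset of Electron's `webContents.debugger`,
// declared locally so the sketch does not need to import electron.
interface DebuggerLike {
  attach(protocolVersion?: string): void;
  sendCommand(method: string, params?: object): Promise<any>;
  detach(): void;
}

// Shape of the `frameTree` value returned by Page.getResourceTree
// (only the fields this sketch reads).
interface FrameResourceTree {
  frame: { id: string; url: string };
  childFrames?: FrameResourceTree[];
  resources: { url: string; type: string; mimeType: string }[];
}

// Illustrative result type, not part of any library.
interface CapturedResource {
  url: string;
  mimeType: string;
  content: string;
  base64Encoded: boolean;
}

async function captureResources(dbg: DebuggerLike): Promise<CapturedResource[]> {
  dbg.attach("1.3");
  try {
    // Enabling the Page domain first; assumed to be harmless if not required.
    await dbg.sendCommand("Page.enable");
    const { frameTree } = await dbg.sendCommand("Page.getResourceTree");

    const captured: CapturedResource[] = [];
    const pending: FrameResourceTree[] = [frameTree];

    // Walk the main frame and every nested frame.
    while (pending.length > 0) {
      const node = pending.pop()!;
      pending.push(...(node.childFrames ?? []));

      for (const res of node.resources) {
        try {
          // Ask the browser for the copy it already loaded for this frame.
          const { content, base64Encoded } = await dbg.sendCommand(
            "Page.getResourceContent",
            { frameId: node.frame.id, url: res.url }
          );
          captured.push({ url: res.url, mimeType: res.mimeType, content, base64Encoded });
        } catch {
          // Skip resources whose content the browser cannot return.
        }
      }
    }
    return captured;
  } finally {
    // Always release the debugger, even if a command failed.
    dbg.detach();
  }
}
```

In the app this would presumably be called from the main process as `captureResources(webContents.debugger)` once the hidden page has finished loading, and the returned resources would then need to be inlined or stored alongside the captured HTML.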