webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0

Provide clean way to inject data into replayed pages #832

Open despens opened 1 year ago

despens commented 1 year ago

Problem I am trying to solve

To make access to an archive more user-friendly, I need to inject some CSS and JavaScript into the head of every page. The pywb documentation states that this can be done via the template head_insert.html; however, changing that template isn't recommended. The use cases are similar to what was proposed via periphery.

Indeed, head_insert.html contains wombat-related code that I would like to avoid changing or having to keep in sync with updated versions of pywb.

Solution I would like to see

It would be great to have a clean and approved way of injecting code into a replayed page, perhaps with separate optional templates like html_head_inject.html and html_body_inject.html, ideally with the whole wbrequest and metadata objects available.
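To illustrate the idea, here is a minimal sketch of how an optional inject template could be layered on top of the stock head insert without touching it. This is not pywb's implementation: the function name `compose_head_insert`, the lookup in a per-collection templates directory, and the plain string concatenation are all assumptions for illustration, and `html_head_inject.html` is only the file name proposed above.

```python
from pathlib import Path

# Hypothetical file name taken from the proposal above; pywb does not ship it.
HEAD_INJECT_NAME = "html_head_inject.html"


def compose_head_insert(stock_head_insert: str, templates_dir: Path) -> str:
    """Return the stock head insert, followed by the custom inject if present.

    The stock insert (with its wombat setup) is passed through unchanged,
    so a pywb update never requires editing the custom file.
    """
    custom = templates_dir / HEAD_INJECT_NAME
    if not custom.is_file():
        # No custom template for this collection: behave exactly as before.
        return stock_head_insert
    return stock_head_insert + "\n" + custom.read_text(encoding="utf-8")
```

The point of the design is that the custom file is purely additive: collections without it are unaffected, and collections with it never need to be re-synced with upstream changes to head_insert.html.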

Alternatives I considered

Manually updating every collection that uses a customized header inject on every pywb update. 🫤

Additional context

I have to take care of lots of individual collections 😃

despens commented 1 year ago

Hello again, colleagues and I were just thinking this through again. Most of periphery's functionality could be implemented with existing pywb functionality as soon as

It would be good to use for

This is metadata that would be well placed in a collection's metadata.yaml file and be interpreted during replay by a template and client-side JavaScript. YAML is a good format for manual editing by curators (and this work does need to be done manually).
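As a rough sketch of the "interpreted by a template and client-side JavaScript" part: the collection metadata could be serialized into a JSON data block that is injected into the page head. Everything here is an assumption for illustration, including the function name `metadata_to_script_tag`, the element id, and the example keys. The metadata is assumed to be already parsed into a dict (for example with PyYAML's `yaml.safe_load`).

```python
import json


def metadata_to_script_tag(metadata: dict, element_id: str = "collection-metadata") -> str:
    """Serialize collection metadata into an inert JSON <script> element."""
    payload = json.dumps(metadata, ensure_ascii=False)
    # Escape "<" so a value containing "</script>" cannot end the element early.
    # "\u003c" is a valid JSON escape and decodes back to "<".
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/json" id="{element_id}">{payload}</script>'


# Hypothetical metadata, standing in for a parsed metadata.yaml.
example = {"title": "Demo collection", "curator_note": "Start at the index page."}
tag = metadata_to_script_tag(example)
```

On the client side, the injected JavaScript could then read the data with `JSON.parse(document.getElementById("collection-metadata").textContent)`. Because the block has type `application/json`, the browser does not execute it, which keeps the curated data separate from the replay scripts.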

(This shouldn't be too hard to make work on fully client-side replay as well once we arrive at a meaningful structure. Periphery is a great starting point already.)