Open nikhiljha opened 2 years ago
potential approaches:
approaches that don't work:
(also static wp means we can put things on a cdn now? do we need one?)
9/27: simply-static isn't compatible with the existing WordPress hosting setup, so we're considering moving to pywb instead: https://pywb.readthedocs.io/en/latest/index.html
OIP coming soon
HTTrack doesn't seem to crawl recursively (i.e. subpages on the site are not downloaded). pywb fails with an installation error that needs to be debugged (something goes wrong while compiling during pip install). It also produces a WACZ file that we need to find a way to convert to HTML.
Resources that might be helpful: https://github.com/iipc/awesome-web-archiving/blob/main/README.md. Looks like we need to explore other tools.
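For anyone poking at the WACZ output: as far as I understand the format, a WACZ is a ZIP archive that bundles the WARC data together with indexes and page metadata. Here's a minimal sketch for seeing what's inside one before deciding how to convert it. The function names are made up for illustration, and the `archive/` prefix for WARC files is an assumption based on the usual WACZ layout. This doesn't convert anything to HTML, it only lists the contents.

```python
import sys
import zipfile


def list_wacz_members(path):
    """Return the sorted names of the files stored inside a WACZ (ZIP) file."""
    with zipfile.ZipFile(path) as archive:
        return sorted(archive.namelist())


def warc_members(names):
    """Pick out the WARC files, assumed to live under the archive/ prefix."""
    return [
        name
        for name in names
        if name.startswith("archive/") and name.endswith((".warc", ".warc.gz"))
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_wacz.py path/to/crawl.wacz
    members = list_wacz_members(sys.argv[1])
    for name in members:
        print(name)
    print("WARC files:", warc_members(members))
```

The WARC files it finds are the part that would need to be replayed or extracted to get back to plain HTML.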
HTTrack can definitely crawl recursively, but it's probably not worth using because it's so old (and it's a Windows program, so running it is logistically difficult).
I think it's probably not too bad to show a web-archive-format thing publicly? I'm not sure. 🤔
Throw all the WordPress instances behind a Keycloak proxy and force them all to render to static sites via https://wordpress.org/plugins/simply-static/ or similar.
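To be concrete about what "recursive" means here: the crawler has to pull the links out of each page it downloads and queue up the ones that stay on the same site. A rough sketch of that one step using only the Python standard library (this is not how HTTrack or pywb do it internally; `LinkCollector` and `same_site_links` are hypothetical names):

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html):
    """Return the sorted absolute http(s) URLs in html that share page_url's host."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment so pages dedupe cleanly.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            found.add(absolute)
    return sorted(found)
```

A full crawl would fetch each returned URL, run the same function on it, and keep a set of already-visited pages so it doesn't loop forever.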