webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0

Can't figure out how to make sequence #897

Open Twi-Hard opened 2 months ago

Twi-Hard commented 2 months ago

I've spent many hours trying to configure pywb to try my collection of WARCs first, then fall back to archive.org, and then fall back to the live web. The documentation makes it seem like this should be possible, but I can't figure it out at all. Does anybody know how? Thanks :)

tw4l commented 2 months ago

Hi @Twi-Hard, have you looked at this part of the pywb documentation?

https://pywb.readthedocs.io/en/latest/manual/warcserver.html?highlight=fallback#sequential-fallback-collections

Let us know if something is unclear or missing!

tw4l commented 2 months ago

If it's not clear from the context, the collection sequence should be configured in your config.yaml file.
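As a rough illustration of what "sequential fallback" means, here is a toy Python sketch. It is not pywb's code or API: `resolve`, `make_local_source`, the source names, and the sample URLs are all made up for the example. Each source is tried in order, and the first one that has a capture wins.

```python
from typing import Callable, Optional

# A source maps a URL to a captured body, or None if it has no capture.
Source = Callable[[str], Optional[str]]


def make_local_source(captures: dict) -> Source:
    """Hypothetical stand-in for an archive collection, backed by a dict."""
    return lambda url: captures.get(url)


def resolve(url: str, sequence: list) -> Optional[tuple]:
    """Try each (name, source) pair in order and return the first hit."""
    for name, source in sequence:
        body = source(url)
        if body is not None:
            return name, body
    # No source in the sequence had the URL.
    return None


# Toy data: a local collection, a fake remote archive, and a fake live web
# that answers for every URL.
sequence = [
    ("local", make_local_source({"http://example.com/": "local copy"})),
    ("ia", make_local_source({"http://example.org/": "archive.org copy"})),
    ("live", lambda url: "live copy of " + url),
]

print(resolve("http://example.com/", sequence))  # served by "local"
print(resolve("http://example.org/", sequence))  # falls back to "ia"
print(resolve("http://example.net/", sequence))  # falls back to "live"
```

In this sketch the order of the list is the whole policy: a later source is only consulted when every earlier one returns nothing for that exact URL.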

Twi-Hard commented 2 months ago

I guess it does work. What confused me is that I couldn't enter an arbitrary URL and have it load if it wasn't archived locally. I now know that it only lets you reach pages that aren't archived locally after you've visited a site that is archived locally. When I tested by trying to access a site I didn't have, it seemed like the config did nothing.

Is it possible to go to any site from the search bar without first having to visit a site I have locally?

Twi-Hard commented 2 months ago

It seems I can only access things from the first collection listed in the sequence, but I have multiple collections, so I can't access much of my archive with this config. I need a way for it to fall back to my second collection if the first collection doesn't have the site I'm looking for.

To clarify: the search doesn't show things that aren't in the first collection. It seems the fallback works once I'm already on a site.