netarchivesuite / solrwayback

A search interface and wayback machine for the UKWA Solr-based warc-indexer framework.
Apache License 2.0

Make binary resolving more flexible #356

Open tokee opened 1 year ago

tokee commented 1 year ago

The current abstraction for resolving resources (WARC records) expects [WARC-filename, offset]. Extending this to [WARC-filename, offset, timestamp, URL] should make it possible to use PyWB as the backend for resolving. This would work well for setups that already run PyWB and don't want to add a new exposure point for their WARCs.

Suggested by Ben O'Brien
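
A minimal sketch of what the extended lookup key could look like, and how the two new fields might be turned into a PyWB replay request. `ResourceKey` and `PywbReplayUrl` are hypothetical names and not part of SolrWayback; the base URL and collection name are placeholders. The URL layout assumes PyWB's `/<collection>/<timestamp>id_/<url>` form, where the `id_` modifier requests the unrewritten payload.

```java
/** Hypothetical lookup key: the existing [WARC-filename, offset] pair plus the proposed fields. */
final class ResourceKey {
    final String warcFilename; // existing field
    final long offset;         // existing field
    final String timestamp14;  // proposed: capture time as yyyyMMddHHmmss
    final String url;          // proposed: original URL of the capture

    ResourceKey(String warcFilename, long offset, String timestamp14, String url) {
        if (timestamp14 == null || !timestamp14.matches("\\d{14}")) {
            throw new IllegalArgumentException("Expected a 14-digit timestamp, got: " + timestamp14);
        }
        if (url == null || url.isEmpty()) {
            throw new IllegalArgumentException("URL must not be empty");
        }
        this.warcFilename = warcFilename;
        this.offset = offset;
        this.timestamp14 = timestamp14;
        this.url = url;
    }
}

/** Hypothetical helper: builds a PyWB replay URL from the timestamp and URL alone. */
final class PywbReplayUrl {
    static String build(String pywbBase, String collection, ResourceKey key) {
        // Strip a single trailing slash so the parts join cleanly.
        String base = pywbBase.endsWith("/")
                ? pywbBase.substring(0, pywbBase.length() - 1)
                : pywbBase;
        // Assumption: "id_" asks PyWB for the original, unrewritten content.
        return base + "/" + collection + "/" + key.timestamp14 + "id_/" + key.url;
    }
}
```

With this shape, a PyWB-backed resolver would ignore `warcFilename` and `offset` entirely, while the existing file-based resolver would keep using them and ignore the two new fields.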

thomasegense commented 1 year ago

Basically, this means implementing the CDX index API.

Note that this would make #388 obsolete.

tokee commented 1 year ago

No, quite the reverse: This allows SolrWayback to use an existing CDX server for resource retrieval.
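
As a rough illustration of the client side, the lookup against an existing CDX server could be a single query for the capture closest to a given timestamp. `CdxQuery` is a hypothetical helper, the endpoint is a placeholder, and the parameter names (`url`, `closest`, `sort`, `limit`, `output`) assume a PyWB-style CDX server API; other CDX servers may differ.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds a CDX query for the capture closest to a timestamp. */
final class CdxQuery {
    static String closestCapture(String cdxEndpoint, String url, String timestamp14) {
        // The URL is percent-encoded because it is passed as a query parameter.
        return cdxEndpoint
                + "?url=" + URLEncoder.encode(url, StandardCharsets.UTF_8)
                + "&closest=" + timestamp14
                + "&sort=closest"   // assumption: required for "closest" to take effect
                + "&limit=1"        // only the best match is needed
                + "&output=json";
    }
}
```

SolrWayback would act purely as a consumer here: it sends the query and reads the result, without having to serve a CDX API of its own.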