iipc / jwarc

Java library for reading and writing WARC files with a typed API
Apache License 2.0

CDX indexer: support revisit records #71

Status: Closed (ato closed this issue 1 year ago)

ato commented 1 year ago

It looks like the Pywb indexer marks revisit records by setting the MIME type field of the CDX line to "warc/revisit". Presumably we should follow that convention. Currently our CDX indexer ignores revisit records entirely, so they never appear in the index.
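As a rough illustration of the convention described above, here is a minimal sketch of how an indexer might choose the MIME type field for a CDX line. It deliberately does not use the jwarc API: the class `CdxMimeField`, the method `forRecord`, and its two string parameters are hypothetical names invented for this example, and the `-` placeholder for a missing value is an assumption based on common CDX practice rather than anything stated in this issue.

```java
import java.util.Locale;

// Hypothetical helper, not part of jwarc: picks the value of the CDX MIME type field.
class CdxMimeField {
    // Value the issue says Pywb writes for revisit records.
    static final String REVISIT_MIME = "warc/revisit";
    // Assumed placeholder for an unknown MIME type.
    static final String UNKNOWN = "-";

    /**
     * @param warcType           value of the record's WARC-Type header, e.g. "response" or "revisit"
     * @param payloadContentType Content-Type of the captured HTTP payload, may be null
     */
    static String forRecord(String warcType, String payloadContentType) {
        // Revisit records are flagged through the MIME type field instead of being skipped.
        if ("revisit".equalsIgnoreCase(warcType)) {
            return REVISIT_MIME;
        }
        if (payloadContentType == null) {
            return UNKNOWN;
        }
        // Drop parameters such as "; charset=utf-8" and normalise case.
        int semicolon = payloadContentType.indexOf(';');
        String mime = semicolon >= 0
                ? payloadContentType.substring(0, semicolon)
                : payloadContentType;
        mime = mime.trim().toLowerCase(Locale.ROOT);
        return mime.isEmpty() ? UNKNOWN : mime;
    }
}
```

With this sketch, a revisit record is indexed with `warc/revisit` regardless of the original payload type, while other records keep their own MIME type. A real implementation would also need to decide which URL, date, and digest values to emit for the revisit line, which this issue does not specify.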