Open thomasegense opened 1 year ago
It is worth testing how much of a speed-up we gain by trusting the SHA-1 hash in the WARC header instead of recalculating it. Note that for old ARC files we still have to calculate the hash.
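A minimal sketch of the idea, not the indexer's actual code: it assumes the record headers are available as a plain dict and that the digest sits in the `WARC-Payload-Digest` field in the usual `sha1:<Base32>` form. The function names `sha1_base32`, `payload_hash` and `time_hashing` are hypothetical and only serve the illustration. ARC records, or WARC records without a usable digest, fall back to recalculating.

```python
import base64
import hashlib
import time


def sha1_base32(payload: bytes) -> str:
    """Recalculate the SHA-1 of the payload, formatted as 'sha1:<Base32>'."""
    digest = hashlib.sha1(payload).digest()
    return "sha1:" + base64.b32encode(digest).decode("ascii")


def payload_hash(headers: dict, payload: bytes, trust_header: bool = True) -> str:
    """Return the payload hash, trusting the header digest when allowed.

    Assumption: 'headers' is a dict of WARC header fields. ARC records
    have no such field, so they always take the recalculation path.
    """
    value = headers.get("WARC-Payload-Digest")
    if trust_header and value and value.lower().startswith("sha1:"):
        return value
    return sha1_base32(payload)


def time_hashing(records, trust_header: bool) -> float:
    """Rough timing helper: seconds spent hashing a list of (headers, payload)."""
    start = time.perf_counter()
    for headers, payload in records:
        payload_hash(headers, payload, trust_header=trust_header)
    return time.perf_counter() - start
```

Running `time_hashing` over the same records with `trust_header=True` and `trust_header=False` gives a first estimate of the time spent on hashing alone. The effect on total indexing time would still have to be measured in the real pipeline.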
This issue should be moved to the webarchive-discovery project.