webrecorder / pywb

Core Python Web Archiving Toolkit for replay and recording of web archives
https://pypi.python.org/pypi/pywb
GNU General Public License v3.0

Sort index when adding wacz archives #820

Closed kuechensofa closed 7 months ago

kuechensofa commented 1 year ago

Description

Ensure that the collection index stays sorted when adding WACZ archives to the collection.

Motivation and Context

CDXJ indices must be sorted for the binary search algorithm to work. When adding WACZ archives and merging their indices with the collection index, the WACZ index was simply appended to the end of the collection index, so the collection index was no longer sorted.
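To illustrate the problem, here is a minimal standalone sketch (not the code from this PR) of why appending breaks lookups and how a sorted merge avoids it. The helper names `merge_cdxj` and `has_prefix` and the sample CDXJ lines are hypothetical. The sketch assumes plain ASCII lines, for which Python's string ordering matches a byte-wise sort.

```python
import bisect
import heapq


def merge_cdxj(collection_lines, wacz_lines):
    """Merge WACZ index lines into an already sorted collection index.

    Hypothetical helper: assumes collection_lines is sorted; wacz_lines
    is sorted here before the two sequences are merged.
    """
    return list(heapq.merge(collection_lines, sorted(wacz_lines)))


def has_prefix(index_lines, key):
    """Binary search for a line starting with key (requires sorted input)."""
    pos = bisect.bisect_left(index_lines, key)
    return pos < len(index_lines) and index_lines[pos].startswith(key)


# Made-up sample index lines, for illustration only.
collection = [
    'com,example)/ 20200101000000 {"url": "http://example.com/"}',
    'org,webrecorder)/ 20210101000000 {"url": "https://webrecorder.org/"}',
]
wacz = [
    'net,archive)/ 20220101000000 {"url": "http://archive.net/"}',
]

appended = collection + wacz           # old behaviour: index no longer sorted
merged = merge_cdxj(collection, wacz)  # sorted merge keeps the index sorted

print(has_prefix(appended, "net,archive)/"))
print(has_prefix(merged, "net,archive)/"))
```

With the appended list, the binary search lands on the wrong line and misses the `net,archive)/` entry, even though it is present. With the merged list, the entry is found.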


Quirinus commented 2 months ago

This is great, thanks!