Moved back to running page detection and CDX generation as separate steps in 0.0.6, after we discovered an off-by-one error affecting the results.
This workaround hurts performance (all WARCs are iterated over twice instead of once), but it should not be needed for long (the error may be a simple programming mistake on my part).
Underlying issue: https://github.com/webrecorder/warcio.js/issues/52