openzim / warc2zim

Command line tool to convert a file in the WARC format to a file in the ZIM format
https://pypi.org/project/warc2zim/
GNU General Public License v3.0

Fix logic of rewrite mode computation for cases raised in #326 #339

Closed benoit74 closed 4 days ago

benoit74 commented 5 days ago

Fix #326

benoit74 commented 5 days ago

Note: I also checked a few other tasks I had in mind that have completed since the 2.0.1 release, to confirm there are no other known cases for now: https://farm.openzim.org/recipes/100rabbits, https://farm.openzim.org/recipes/website.test.openzim.org_en_all, https://farm.openzim.org/recipes/cloudflare.com_en_learning-center

benoit74 commented 4 days ago

Thanks a lot for the remark. I have indeed added documentation (in technical_architecture.md) and opened a discussion with webrecorder to see if we can improve the situation (things have changed a bit since the original discussions about the WARC-Resource-Type header): https://github.com/webrecorder/browsertrix-crawler/issues/630