openzim / warc2zim

Command line tool to convert a file in the WARC format to a file in the ZIM format
https://pypi.org/project/warc2zim/
GNU General Public License v3.0

Incorrect HttpUrl scheme in value: data:application/javascript;base64,... #421

Open benoit74 opened 2 weeks ago

benoit74 commented 2 weeks ago

Task: https://farm.zimit.kiwix.org/pipeline/da6d20be-2425-4919-896a-6f128d456508/debug

Page with problem: https://gwern.net/doc/www/www.astralcodexten.com/2012350b8562ddac309376a3b25e1a4ff3f9f648.html

benoit74 commented 1 week ago

A solution has been implemented in https://github.com/openzim/python-scraperlib/pull/216

This issue will be resolved once the upstream PR is merged and released, and https://github.com/openzim/warc2zim/issues/411 is solved.
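
For readers landing here, a rough illustration of the failure mode in the title: a `data:` URI has the scheme `data` and no host, so anything that expects an http(s) URL will reject it. The sketch below uses only the standard library and is not the code from the linked PR; `is_http_url` and `split_by_kind` are hypothetical helper names, and the base64 payload `AAAA` is a placeholder rather than the value from the failing page.

```python
from urllib.parse import urlsplit


def is_http_url(value: str) -> bool:
    """Return True only for absolute http(s) URLs that have a host."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def split_by_kind(values):
    """Separate http(s) URLs from everything else (data:, blob:, mailto:, ...).

    Hypothetical helper: a caller could validate only the first list
    as HTTP URLs and pass the second list through unchanged.
    """
    http_urls, others = [], []
    for value in values:
        (http_urls if is_http_url(value) else others).append(value)
    return http_urls, others


if __name__ == "__main__":
    found = [
        "https://gwern.net/doc/www/",
        "data:application/javascript;base64,AAAA",  # placeholder payload
    ]
    http_urls, others = split_by_kind(found)
    print("http(s):", http_urls)
    print("other schemes:", others)
```

Checking the scheme before any HTTP-specific validation is one way to avoid this class of error; whether the upstream fix takes this approach is not stated in this thread.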