openzim / warc2zim

Command line tool to convert a file in the WARC format to a file in the ZIM format
https://pypi.org/project/warc2zim/
GNU General Public License v3.0

Automatic detection of encoding not used for JS, JSON (and CSS) files #301

Closed: benoit74 closed this 2 weeks ago

benoit74 commented 3 weeks ago

We currently have custom detection of file encoding, but unfortunately it is only applied to HTML content.

It is not applied to JS and JSON content, which makes some of these rewrites fail.

For CSS, we let tinycss use its own logic to detect encoding.

We should use the same automatic encoding detection for all files (including CSS, for consistency).
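To illustrate the idea, here is a minimal sketch of a single decoding helper that every rewriter (HTML, JS, JSON, CSS) could call before processing content. It is not warc2zim's actual implementation: the name `decode_content`, the `fallbacks` parameter, and the order of attempts (BOM, then the `charset` declared in the `Content-Type` header, then a fixed list of fallbacks) are all assumptions made for the example.

```python
import re
from typing import Optional, Sequence

# Hypothetical pattern to pull a charset out of a Content-Type header value.
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def decode_content(
    raw: bytes,
    content_type: Optional[str] = None,
    fallbacks: Sequence[str] = ("utf-8", "windows-1252"),
) -> str:
    """Decode raw bytes to text with one shared strategy for all file types.

    Hypothetical helper: the candidate order below is an assumption,
    not the detection logic the project actually uses.
    """
    candidates = []

    # A UTF-8 byte order mark is the strongest hint, so try it first.
    if raw.startswith(b"\xef\xbb\xbf"):
        candidates.append("utf-8-sig")

    # Then honor the charset declared in the HTTP header, if any.
    if content_type:
        match = CHARSET_RE.search(content_type)
        if match:
            candidates.append(match.group(1))

    # Finally fall back to a fixed list of common encodings.
    candidates.extend(fallbacks)

    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            # Unknown encoding name or bytes invalid for it: try the next one.
            continue

    # Last resort: never raise, replace undecodable bytes instead.
    return raw.decode("utf-8", errors="replace")
```

With a helper like this, the JS, JSON and CSS rewriters would all receive an already decoded `str`, so a CSS parser would no longer need to guess the encoding on its own.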

benoit74 commented 3 weeks ago

This bug has caused the failure of: