openzim / warc2zim

Command line tool to convert a file in the WARC format to a file in the ZIM format
https://pypi.org/project/warc2zim/
GNU General Public License v3.0

Remove the DS rewriting rules #328

Open benoit74 opened 1 week ago

benoit74 commented 1 week ago

See https://github.com/webrecorder/wabac.js/pull/182#issuecomment-2185726884 for the start of a discussion on whether we need DS rewriting at all. It looks like this rewriting is already done by the crawler at crawl time.

benoit74 commented 5 days ago

Ilya confirmed that we do not need the DS rewriting rules in warc2zim; this rewriting is already done in Browsertrix Crawler.

kelson42 commented 5 days ago

What is DS?

benoit74 commented 5 days ago

RTFM 🤣 https://github.com/openzim/warc2zim/blob/main/docs/functional_architecture.md#ds-rules 😇