Closed benoit74 closed 2 weeks ago
In warc2zim, we have tooling which confirms that:
This is run at the beginning of the scraper, to avoid losing time scraping content and then failing many hours later.
https://github.com/openzim/warc2zim/blob/32d1a20e5df425ed737f07d91155978fe0a92bed/src/warc2zim/converter.py#L127-L162
Sharing this logic in zimscraperlib would be great, since all scrapers should perform these checks.
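As a rough illustration of the fail-early idea (not the actual warc2zim code, which is at the link above), a shared helper could look something like the sketch below. It assumes, purely for the example, that one of the conditions to verify is that the output directory exists and is writable. The name `check_output_writable` is hypothetical and is not an existing zimscraperlib or warc2zim function.

```python
import tempfile
from pathlib import Path


def check_output_writable(output_dir: Path) -> None:
    """Hypothetical pre-flight check: raise early if output_dir is unusable.

    Assumption for this sketch: the scraper only needs the output
    directory to exist, be a directory, and accept new files.
    """
    if not output_dir.exists():
        raise FileNotFoundError(f"Output directory does not exist: {output_dir}")
    if not output_dir.is_dir():
        raise NotADirectoryError(f"Output path is not a directory: {output_dir}")
    try:
        # Creating (and automatically deleting) a temporary file is a
        # practical way to confirm that the directory is writable.
        with tempfile.NamedTemporaryFile(dir=output_dir):
            pass
    except OSError as exc:
        raise PermissionError(
            f"Output directory is not writable: {output_dir}"
        ) from exc
```

A scraper would call this once at startup, before any network or conversion work, so that a bad output path fails within seconds rather than after hours of scraping.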