Closed: IclickButtons closed this issue 7 years ago
Opened an issue in the newspaper repo: https://github.com/codelucas/newspaper/issues/379
I will add a mode to commoncrawl.py that ignores any error that occurs while processing the WARC files.
Commit 3c81c4392032f1715c2c03f98e3a1f3d95601107 adds an option that lets you continue when an error occurs. Just set continue_after_error = True in commoncrawl.py.
This should help until the issue is resolved in newspaper
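For readers unfamiliar with the pattern, here is a minimal sketch of how a continue-on-error switch like this typically works. It is not the code from the commit: only the name continue_after_error comes from this thread, while process_records, handle_record, and the returned counters are hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger(__name__)

# Name taken from the thread; everything else in this sketch is hypothetical.
continue_after_error = True


def process_records(records, handle_record, continue_after_error=continue_after_error):
    """Call handle_record on each record, optionally skipping records that raise.

    Returns a (processed, failed) tuple so the caller can see how much was skipped.
    """
    processed, failed = 0, 0
    for record in records:
        try:
            handle_record(record)
            processed += 1
        except Exception:
            if not continue_after_error:
                # Strict mode: let the first error stop the whole run.
                raise
            failed += 1
            # Log the traceback so skipped records remain diagnosable.
            logger.exception("Skipping record after error")
    return processed, failed
```

Logging the traceback rather than silently swallowing it keeps the underlying problem visible while the run continues.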
Thanks for creating this issue. I'll investigate on newspaper's end.
After running commoncrawl.py for about 15 minutes, it throws the following error: