Open achimr opened 7 years ago
locate_candidates_cc_index_api.py doesn't rate-limit its queries to the CommonCrawl index server at http://index.commoncrawl.org. The server is reported to be frequently under heavy load (see https://groups.google.com/forum/#!topic/common-crawl/o_MuZViu0O0). We should be nice and rate-limit our queries.
Workaround: run our own index server (the mailing list thread describes how to set one up).
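
One possible shape for the fix, as a minimal sketch: enforce a minimum pause between successive index queries. The names `RateLimiter` and `throttled`, the `send` callback, and the interval value are all hypothetical. They are not taken from locate_candidates_cc_index_api.py, and the interval would need to be chosen based on what the CommonCrawl maintainers consider acceptable.

```python
import time


class RateLimiter:
    """Enforce a minimum interval (in seconds) between successive wait() calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the limiter can be tested without real delays
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def throttled(queries, send, limiter):
    """Call send(query) for each query, pausing via the limiter before each call."""
    results = []
    for query in queries:
        limiter.wait()
        results.append(send(query))
    return results
```

In the script, `send` would be whatever function currently issues the HTTP request to the index server, for example `throttled(urls, query_index, RateLimiter(1.0))`, where `query_index` and the one-second interval are placeholders.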