At the time of writing, oh-bugimporters has difficulty downloading
all the bugs it wants from github.com.
@ehashman discovered that GitHub throttles API requests once a
client exceeds 5000 per hour.
The Scrapy DOWNLOAD_DELAY setting affects only "consecutive pages
from the same website", so we should still see a sizeable amount
of parallelism in our crawling after this change. However,
since this setting applies to every domain we crawl, not just
github.com, we might still see a general slowdown.
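
As a rough illustration of the arithmetic behind such a delay, here is
a minimal sketch of a Scrapy settings fragment. The names
SECONDS_PER_HOUR and GITHUB_REQUESTS_PER_HOUR are hypothetical helpers
introduced only for this example, and the resulting value is an
assumption derived from the 5000-per-hour figure above, not
necessarily the value this change uses.

```python
# Hypothetical settings.py fragment (illustrative only).
# SECONDS_PER_HOUR and GITHUB_REQUESTS_PER_HOUR are names made up for
# this sketch; only DOWNLOAD_DELAY is a real Scrapy setting.
SECONDS_PER_HOUR = 3600.0
GITHUB_REQUESTS_PER_HOUR = 5000  # the limit reported above

# Smallest average spacing (in seconds) between consecutive requests to
# the same site that keeps us at or below the hourly limit.
# Note: with Scrapy's default RANDOMIZE_DOWNLOAD_DELAY = True, each
# actual wait is drawn from 0.5x to 1.5x of this value.
DOWNLOAD_DELAY = SECONDS_PER_HOUR / GITHUB_REQUESTS_PER_HOUR
```

Under these assumptions the delay comes out to a bit under one second
per request; a larger value would leave headroom below the limit.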