joeyism / py-edgar

A small library to access files from SEC's edgar
GNU General Public License v3.0

Read timed out error #17

Closed: gregjasonroberts closed this issue 4 years ago

gregjasonroberts commented 4 years ago

I'm trying to pull the past few 10-K documents for each company in the S&P. The for-loop returns successfully for the first few companies, typically between 5 and 10, and then I get the following error: requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='www.sec.gov', port=443): Read timed out. (read timeout=10)

Am I hitting the site too hard with these requests, or could something else be causing this issue? Would you recommend adding a sleep between requests, or some other alternative?
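For reference, one way the "sleep" idea could look is a small retry helper that waits between attempts and doubles the wait each time. This is only a sketch: `fetch_with_retry` and its parameters are hypothetical names, not part of py-edgar, and `fetch` stands in for whatever call downloads a single filing. With requests, the exception from the traceback above (`requests.exceptions.ReadTimeout`) would be passed as `retry_on`.

```python
import time


def fetch_with_retry(fetch, retry_on=(TimeoutError,), attempts=4,
                     base_delay=1.0, sleep=time.sleep):
    """Call fetch(); if it raises one of retry_on, wait and try again.

    Hypothetical helper, not part of py-edgar. The wait doubles after
    each failed attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            # Out of attempts: let the caller see the original error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Inside the loop over companies, the per-company download would be wrapped in a zero-argument function or lambda and passed in as `fetch`. Retrying does not fix a flaky local connection, but it keeps one timeout from aborting the whole run.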

joeyism commented 4 years ago

uh.. that may be on your machine. What happens if you ping www.sec.gov?

ppulipaka commented 4 years ago

Hi @gregjasonroberts, this might be helpful. Source: https://www.sec.gov/privacy.htm#security

We reserve the right to block IP addresses that submit excessive requests. Current guidelines limit users to a total of no more than 10 requests per second, regardless of the number of machines used to submit requests. If a user or application submits more than 10 requests per second, further requests from the IP address(es) may be limited for a brief period. Once the rate of requests has dropped below the threshold for 10 minutes, the user may resume accessing content on SEC.gov.
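A rough sketch of staying under that limit on the client side, assuming a single process making requests one after another. `Throttle` is a hypothetical helper, not part of py-edgar or requests, and the default of 5 requests per second is an arbitrary choice that leaves headroom below the quoted 10.

```python
import time


class Throttle:
    """Space out calls so that at most max_per_second happen per second.

    Hypothetical helper for illustration. clock and sleep can be swapped
    out, which makes the class easy to test without real waiting.
    """

    def __init__(self, max_per_second=5, clock=time.monotonic,
                 sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may run

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Calling `throttle.wait()` right before each request in the loop keeps consecutive requests at least `min_interval` seconds apart. If several machines or processes share the work, each would need a proportionally lower rate, since the quoted guideline counts requests in total.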

gregjasonroberts commented 4 years ago

Thanks Joey. It does appear to be an intermittent connection issue on my end.

@ppulipaka, appreciate the info on the current guidelines.