dgunning / edgartools

Python library for working with SEC Edgar
MIT License

rate limits #37

Closed jackhhchan closed 3 months ago

jackhhchan commented 3 months ago

Hi, does an internal mechanism exist to track the number of requests to match the rate limit imposed by data.sec.gov?

The limit is currently 10 requests/second.

https://www.sec.gov/oit/announcement/new-rate-control-limits

Thanks!

dgunning commented 3 months ago

Good question. The answer is "sort of", though I realize now that the current mechanism is fairly crude and doesn't directly handle some scenarios.

  1. The httpx.Client is set with a max connection pool of 10 and a timeout of 12 seconds. This is the NORMAL mode. There are two more restrictive modes:

# Modes of accessing edgar

# The normal mode of accessing edgar
NORMAL = EdgarSettings(http_timeout=12, max_connections=10)

# A bit more cautious mode of accessing edgar
CAUTION = EdgarSettings(http_timeout=15, max_connections=5)

# Use this setting when you have long-running jobs and want to avoid breaching Edgar limits
CRAWL = EdgarSettings(http_timeout=20, max_connections=2, retries=2)

# Use normal mode
edgar_mode = NORMAL

This helps in cases where the library uses a client inside a with block and so reuses the HTTP connection pool, but that pattern isn't guaranteed across the library. Note also that the pool caps concurrent connections, not requests per second.
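To see why a connection cap is only a rough proxy for a rate limit, here is a small back-of-the-envelope sketch. It assumes every pooled connection is kept busy concurrently (threads or async), in which case peak throughput is roughly max_connections divided by per-request latency. The `EdgarSettings` dataclass below is a stand-in that mirrors the fields shown above, not the library's actual class; the `retries` default and the latency values are illustrative assumptions.

```python
from dataclasses import dataclass

# 10 requests/second, per the SEC announcement linked in the question
SEC_LIMIT_PER_SECOND = 10


@dataclass(frozen=True)
class EdgarSettings:
    """Stand-in for the library's settings object (assumed shape)."""
    http_timeout: int
    max_connections: int
    retries: int = 0  # assumed default; the real default is not shown in the thread


def peak_requests_per_second(settings: EdgarSettings, latency_seconds: float) -> float:
    """Upper bound on throughput when all pooled connections stay busy."""
    return settings.max_connections / latency_seconds


NORMAL = EdgarSettings(http_timeout=12, max_connections=10)
CRAWL = EdgarSettings(http_timeout=20, max_connections=2, retries=2)

for name, settings in (("NORMAL", NORMAL), ("CRAWL", CRAWL)):
    for latency in (1.0, 0.25):  # illustrative response times in seconds
        rate = peak_requests_per_second(settings, latency)
        verdict = "over" if rate > SEC_LIMIT_PER_SECOND else "within"
        print(f"{name} at {latency}s latency: up to {rate:.0f} req/s ({verdict} the limit)")
```

Under these assumptions, a pool of 10 stays within the limit only when responses take about a second or longer; with faster responses and enough concurrency it could exceed it, which is why the cap alone is not a guarantee.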

In practice you won't hit the limit on a single thread unless you have an extremely fast CPU.

A more robust mechanism is planned.
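Until then, one possible shape for such a mechanism is a client-side throttle that callers invoke before each request. The following is a minimal sketch of a sliding-window limiter using only the standard library; it is not what edgartools implements, and the `RateLimiter` class and its parameters are hypothetical names chosen for illustration.

```python
import threading
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` acquisitions in any window of `period` seconds.

    Hypothetical helper for illustration; not part of edgartools.
    """

    def __init__(self, max_calls=10, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._stamps = deque()   # timestamps of recent acquisitions, oldest first
        self._lock = threading.Lock()

    def acquire(self):
        """Block until a request slot is free, then record the request time."""
        while True:
            with self._lock:
                now = self._clock()
                # Discard timestamps that have aged out of the window
                while self._stamps and now - self._stamps[0] >= self.period:
                    self._stamps.popleft()
                if len(self._stamps) < self.max_calls:
                    self._stamps.append(now)
                    return
                # Window is full: wait until the oldest timestamp expires
                wait = self.period - (now - self._stamps[0])
            # Sleep outside the lock so other threads are not blocked
            self._sleep(wait)


# One shared limiter for the whole process, matching the 10 requests/second limit
limiter = RateLimiter(max_calls=10, period=1.0)
```

A caller would run `limiter.acquire()` immediately before each HTTP request. Sharing a single instance across threads keeps the combined rate under the limit regardless of how many connections the pool allows.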

dgunning commented 3 months ago

Closing. Will open a new PR for the performance improvements.