CivicTechTO / ttc_subway_times

A scraper to grab and publish TTC subway arrival times.
GNU General Public License v3.0

API Throttling #54

Closed (radumas closed this issue 5 years ago)

radumas commented 5 years ago

We've noticed that the API seems to lock us out when we use the async method of sending requests. The async method appears to send too many requests too quickly.

The current serverless version of the data pipeline sends serial requests to the API instead, and that seems fine.

Dunno if there's a way to put a sleep timer on the async requests, which seems a liiiiittle counter-intuitive.
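
For what it's worth, a sleep timer on async requests could look roughly like the sketch below: each request waits a little longer than the previous one before it is sent, so the requests are spaced out but their round trips can still overlap. This is only an illustration, not the scraper's actual code. `fetch_station`, `fetch_all_staggered`, the station IDs and the `spacing_s` value are all made up, and the API call is simulated with `asyncio.sleep`.

```python
import asyncio
import time


async def fetch_station(station_id, delay_s):
    # Hypothetical stand-in for the real API request (not the scraper's code).
    await asyncio.sleep(delay_s)  # wait before sending, so requests are spaced out
    sent_at = time.monotonic()    # moment the simulated request goes out
    await asyncio.sleep(0.01)     # simulated network round trip
    return station_id, sent_at


async def fetch_all_staggered(station_ids, spacing_s=0.05):
    # Request i is sent roughly i * spacing_s seconds after the first one.
    tasks = [fetch_station(sid, i * spacing_s) for i, sid in enumerate(station_ids)]
    # gather() returns results in the same order as the tasks were passed in.
    return await asyncio.gather(*tasks)


staggered_results = asyncio.run(fetch_all_staggered([1, 2, 3, 4]))
```

The spacing would need tuning against whatever rate the API actually tolerates, which this thread does not establish.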

perobertson commented 5 years ago

I believe you can cap the number of concurrent requests (a pool size) with asyncio. If we limit it to only a couple of requests at a time, we will still get a benefit without overloading their server.
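
A minimal sketch of that idea, assuming an `asyncio.Semaphore` is used as the "pool size" (asyncio has no built-in pool-size setting as such, so the semaphore is a stand-in for it). `fetch_with_limit`, `fetch_all_limited` and `max_concurrent` are hypothetical names, and the request itself is simulated with `asyncio.sleep` rather than a real call to the API.

```python
import asyncio


async def fetch_with_limit(station_id, semaphore, stats):
    # At most `max_concurrent` coroutines can be inside this block at once.
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # placeholder for the real HTTP request
        stats["active"] -= 1
    return station_id


async def fetch_all_limited(station_ids, max_concurrent=2):
    semaphore = asyncio.Semaphore(max_concurrent)
    stats = {"active": 0, "peak": 0}  # tracks how many requests overlap
    results = await asyncio.gather(
        *(fetch_with_limit(sid, semaphore, stats) for sid in station_ids)
    )
    return results, stats["peak"]


limited_results, peak_in_flight = asyncio.run(fetch_all_limited(range(10)))
```

Here `peak_in_flight` never exceeds `max_concurrent`, so only a couple of requests are ever open at once. If the scraper happens to use aiohttp, the `limit` argument of `aiohttp.TCPConnector` should give a similar cap at the connection level.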