Closed bogdankharchenko closed 3 weeks ago
@bogdankharchenko
With the added requests, my main concern is that we're likely to trigger Meetup's REST API throttle unless we play by their rules.
The archived docs from the Wayback Machine show the throttling, which appears to be about 30 requests with a 10-second reset, which I assume means we'd want to average fewer than 3 requests per second, if I'm understanding that correctly.
We triggered Meetup's IP throttle recently with stage and live both running hourly, so I switched stage to run less frequently to avoid triggering the throttle all the time.
They show their limit in the HTTP response headers and it's 2,000 per hour, so that's less of a concern.
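To put rough numbers on both limits, here's a quick back-of-the-envelope sketch. It assumes the 2,000 figure counts requests and that both limits are simple "N requests per window" caps; the helper name is made up for illustration.

```python
def min_interval_seconds(max_requests: int, window_seconds: float) -> float:
    """Average spacing between requests that stays exactly at the limit."""
    return window_seconds / max_requests


# Limit from the archived docs: about 30 requests per 10-second window.
burst_interval = min_interval_seconds(30, 10)

# Limit reported in the response headers: 2,000 per hour (assumed to be requests).
hourly_interval = min_interval_seconds(2000, 3600)

print(f"burst limit:  one request every {burst_interval:.2f}s")
print(f"hourly limit: one request every {hourly_interval:.2f}s")
```

If those assumptions hold, the hourly cap only matters for a run that keeps requesting for a large part of an hour; short runs are bounded by the 10-second window.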
Also, I think we talked about being able to configure the importer through .env but it was dismissed. It may be a good idea to allow some of these tasks to be configured through .env so we're not running live and stage at the same time, and so we can run things less frequently on stage. I'm currently doing this by tweaking stage's crontab, but it's less granular than if we used the task timing configuration in a dynamic way.
@allella we should use separate API keys for staging or turn off the scheduler.
@bogdankharchenko the Meetup REST API does not use API keys. It previously did, but once they sunset the API we moved to hitting the REST API without a key. Their throttling is IP address based, so it's just something we'll have to keep in mind.
Given that we've already hit the limit, it may be wise to add some throttling, such as a one-second delay between requests, to see if that helps.
If we moved to the new GraphQL API, it also has limits, and it's unclear whether we'd be able to get more than one API key, since it requires a Pro account and only the admin of that account seems to be able to apply for an OAuth key for GraphQL.
Adding a one-second delay to the importer and purge tasks may be the simplest way to avoid throttles.
@bogdankharchenko or, a better idea: make the throttle delay (in seconds) configurable through .env so we can easily experiment with it and set different throttle times on prod vs stage.
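A rough sketch of what an env-configurable throttle could look like, written in Python just to show the shape of it, not the project's actual code. The variable name `IMPORTER_THROTTLE_SECONDS`, the one-second default, and both function names are assumptions for illustration.

```python
import math
import os
import time


def throttle_delay_seconds(env=os.environ, default=1.0):
    """Read the delay from the environment, falling back to a default.

    IMPORTER_THROTTLE_SECONDS is a hypothetical variable name.
    """
    raw = env.get("IMPORTER_THROTTLE_SECONDS", "")
    try:
        delay = float(raw)
    except ValueError:
        return default
    if not math.isfinite(delay):
        return default
    # Negative values make no sense for a sleep; clamp to zero.
    return max(delay, 0.0)


def run_throttled(tasks, delay, sleep=time.sleep):
    """Call each task in order, pausing `delay` seconds between calls."""
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            sleep(delay)  # no pause before the first request
        results.append(task())
    return results
```

With something like this, stage could set a longer delay than prod in its own .env without touching the crontab, and setting the value to 0 would turn the throttle off.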
This could be in addition to making the cron jobs more configurable, but I think that idea is less necessary if we are able to configure throttling.
This sets up a daily command that checks every event to confirm its page returns a successful response. If it does not, the event is deleted from our database.
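The check described above can be sketched roughly as follows. This is an illustration in Python, not the command's actual implementation: the event shape (`id`, `url`), the `fetch_status` callable, and the reading of "successful" as any 2xx status are all assumptions.

```python
from typing import Callable, Iterable, List


def find_dead_events(
    events: Iterable[dict],
    fetch_status: Callable[[str], int],
) -> List[int]:
    """Return ids of events whose page did not return a 2xx status.

    `fetch_status` is a hypothetical callable that takes a URL and
    returns the HTTP status code of the response.
    """
    dead = []
    for event in events:
        status = fetch_status(event["url"])
        if not 200 <= status < 300:
            dead.append(event["id"])
    return dead


# Example with a stubbed fetcher instead of real HTTP calls.
statuses = {"https://example.com/a": 200, "https://example.com/b": 404}
events = [
    {"id": 1, "url": "https://example.com/a"},
    {"id": 2, "url": "https://example.com/b"},
]
to_delete = find_dead_events(events, lambda url: statuses[url])
```

One thing to watch under this rule: a throttled or transient error response would also count as unsuccessful, so the purge likely needs the same request spacing as the importer.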
Resolves: https://github.com/hackgvl/hackgreenville-com/issues/238