mitodl / mitxonline

Improve resilience of posthog integration #2116

Open rhysyngsun opened 4 months ago

rhysyngsun commented 4 months ago

Description/Context

PostHog had an outage that caused calls to their API to time out.

We worked around this by setting IN_TEST_SUITE=True and FEATURES_DEFAULT=True, which only worked because we were fortunate enough to have all features enabled. It took some time to evaluate whether this was the correct way to address the issue, because IN_TEST_SUITE sounds like it could have other undesirable effects.

Plan/Design

pdpinch commented 3 months ago

How similar is this to the work @jkachel is doing in https://github.com/mitodl/mit-open/pull/693?

collinpreston commented 3 months ago

@rhysyngsun what does "pushing us well over the 30 second Heroku request timeout." mean?

rhysyngsun commented 3 months ago

Heroku has a 30 second request timeout: https://devcenter.heroku.com/articles/request-timeout

collinpreston commented 2 months ago

This should not be automatically closed. This can be closed once the changes added in https://github.com/mitodl/ol-django/pull/152 are integrated with MITx Online.

collinpreston commented 1 month ago

Blocked by https://github.com/mitodl/ol-django/pull/156

pdpinch commented 1 month ago

Even though we made this change, we had another outage. Unlike the first outage, PostHog returned an error this time instead of timing out, so we need to add better error handling.
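
A minimal sketch of what handling both failure modes might look like, assuming the flag lookup can be wrapped in a plain callable. The names `is_enabled_safely` and `fetch_flag` and the 2-second budget are hypothetical and are not taken from ol-django or the PostHog client; the idea is only that a hang and an error both fall back to a configured default instead of failing the request.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

log = logging.getLogger(__name__)

# Shared worker pool so a slow lookup does not hold the request thread
# for longer than the timeout budget below.
_executor = ThreadPoolExecutor(max_workers=4)


def is_enabled_safely(
    fetch_flag: Callable[[], bool],
    default: bool,
    timeout_seconds: float = 2.0,
) -> bool:
    """Return the flag value, or `default` if the lookup hangs or fails.

    `fetch_flag` is a hypothetical zero-argument callable wrapping the
    real feature flag call; it is an assumption, not an existing API.
    """
    future = _executor.submit(fetch_flag)
    try:
        return bool(future.result(timeout=timeout_seconds))
    except FutureTimeout:
        # Failure mode of the first outage: the provider never answers.
        log.warning("Feature flag lookup timed out; using default=%s", default)
        future.cancel()  # no effect if the call is already running
        return default
    except Exception:
        # Failure mode of the second outage: the provider returns an error.
        log.exception("Feature flag lookup failed; using default=%s", default)
        return default
```

With something like this, the fallback value would be an explicit per-call choice rather than a side effect of IN_TEST_SUITE. The timeout budget would need to stay well under Heroku's 30 second limit, since several flags may be checked in a single request.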