Open sentry-io[bot] opened 3 years ago
Interestingly, I would point out that the error only appears to affect dockets.
Hm, troubling. I don't know why it'd only affect dockets, nor why anything would be different here. The only change to networking that's relevant is that our python code is now dockerized, so it has to go through some network hoops to arrive at redis.
One thing that has changed with redis is that it has about 17GB of stuff in it all of a sudden, instead of the paltry amounts it had before. That 17GB of stuff is related to the 1.1M failed IA uploads, since most failed celery tasks get stored in Redis for some number of hours. 17GB still isn't much in the scheme of things, but maybe it's playing a role.
This one seems hard to diagnose and fix. One strategy could be retries, but those have their own issues.
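To make the retry idea concrete, here is a minimal sketch of retrying a flaky call with exponential backoff and jitter. `call_with_retries` and its parameters are hypothetical names for illustration, not anything in our codebase. It catches the built-in `ConnectionError` by default; for real Redis calls you would most likely need to pass redis-py's own exception class via `retry_on`, since as far as I know that one does not inherit from the built-in.

```python
import random
import time


def call_with_retries(func, attempts=3, base_delay=0.1,
                      retry_on=(ConnectionError,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with backoff.

    Hypothetical helper for illustration only. The delay doubles after
    each failed attempt, plus random jitter so that many workers do not
    all retry at the same moment.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the original error propagate.
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

One of the issues with retries is that a command may have reached Redis before the connection dropped, so retrying something that is not idempotent (an increment, a list push) can apply it twice. Retries also hide the underlying problem rather than fix it.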
Via the Sentry timeline it looks like this was resolved. I think the resolution was to clear out redis in issue #1460.
This issue is still kind of around sometimes, though it's infrequent. The solution seems to be to get health_check_interval landed into django-redis-cache. The issue for that is closed atm, but it's here: https://github.com/sebleier/django-redis-cache/issues/184
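For reference, this is roughly what the option looks like when talking to redis-py directly instead of going through django-redis-cache. It is only a sketch: `redis_connection_kwargs` and `make_client` are hypothetical helpers, the host, port, and 30-second interval are placeholder values, and it assumes a redis-py version that accepts `health_check_interval` (3.3.0 or later, as far as I know).

```python
def redis_connection_kwargs(host="localhost", port=6379,
                            health_check_interval=30):
    """Build keyword arguments for redis.Redis (hypothetical helper).

    health_check_interval asks redis-py to check a connection before
    using it for a command if it has been idle longer than this many
    seconds, so a stale socket can be replaced before the real command
    is sent.
    """
    return {
        "host": host,
        "port": port,
        "health_check_interval": health_check_interval,
    }


def make_client(**overrides):
    """Create a client; imported lazily so redis-py is only needed here."""
    import redis  # assumes redis-py >= 3.3.0 is installed

    return redis.Redis(**redis_connection_kwargs(**overrides))
```

Creating the client does not open a connection by itself; the health check only comes into play once commands are actually issued.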
We've seen an explosion in Redis connections failing with the python3 conversion.
Sentry Issue: COURTLISTENER-H4