Add retry logic to backpressure redis clusters

kneeyo1 commented 22 hours ago

Backpressure does not reinitialize redis cluster or single-node redis connections on timeouts.

Some kind of maintenance event or replication failover in our redis cluster meant that an old host/port combination configured in our backpressure was no longer pointing to our active cluster. Backpressure then started receiving timeouts when trying to get metrics and marked the cluster as unhealthy. This change should force it to reinitialize the connection on timeout.

relates to https://github.com/getsentry/sentry-redis-tools/pull/18
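
For readers following along, here is a minimal sketch of the reinitialize-on-timeout idea described above. It is not the code in this PR: `ReconnectingClient`, `factory`, and `max_retries` are hypothetical names chosen for illustration, and the built-in `TimeoutError` stands in for whatever timeout exception the real client raises.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class ReconnectingClient:
    """Hypothetical wrapper: rebuilds the underlying client after a timeout."""

    def __init__(self, factory: Callable[[], object], max_retries: int = 1) -> None:
        # `factory` is assumed to build a fresh client from current config,
        # so calling it again picks up a new host/port after a failover.
        self._factory = factory
        self._max_retries = max_retries
        self._client = factory()

    def call(self, operation: Callable[[object], T]) -> T:
        """Run `operation(client)`, reinitializing the client on timeout."""
        attempts = 0
        while True:
            try:
                return operation(self._client)
            except TimeoutError:
                if attempts >= self._max_retries:
                    # Retries exhausted: let the caller see the timeout.
                    raise
                attempts += 1
                # Discard the possibly stale connection and build a new one.
                self._client = self._factory()
```

With redis-py the `except` clause would need to name `redis.exceptions.TimeoutError`, since that library raises its own exception class rather than the built-in one. Capping the retries keeps a cluster that is really down from looping forever, so the unhealthy marking still happens once reinitialization does not help.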

lynnagara commented 3 hours ago

should this be a backpressure-specific thing?

shouldn't redis cluster code behave this way by default everywhere?

codecov[bot] commented 2 hours ago

:x: 1487 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 23116 | 1487 | 21629 | 215 |

View the top 3 failed tests by shortest run time

tests.sentry.hybridcloud.services.test_control_organization_provisioning.TestControlOrganizationProvisioningSlugUpdates__InControlMode::test_conflicting_unregistered_organization_with_slug_exists

Stack Traces | 0.005s run time

```python
No failure message available
```

tests.sentry.users.api.endpoints.test_user_authenticator_details.UserAuthenticatorDetailsTest::test_sms_get_phone

Stack Traces | 0.005s run time

```python
No failure message available
```

tests.sentry.users.web.test_accounts.TestAccounts::test_post_success

Stack Traces | 0.005s run time

```python
No failure message available
```


getsentry / sentry

Add retry logic to backpressure redis clusters #81102
