getsentry / self-hosted

Sentry, feature-complete and packaged up for low-volume deployments and proofs-of-concept
https://develop.sentry.dev/self-hosted/

Redis DB is full of unprocessed events #1787

Open sander85 opened 1 year ago

sander85 commented 1 year ago

Self-Hosted Version

22.10.0

CPU Architecture

x86_64

Docker Version

20.10.12

Docker Compose Version

1.29.2

Steps to Reproduce

  1. Just use Sentry

Expected Result

Redis shouldn't accumulate an ever-growing number of keys in the database.

Actual Result

Since upgrading to 22.10.0 we see a lot of keys in the Redis database, and the number grows pretty fast if the DB is not flushed. This didn't happen with previous versions of Sentry. Many of the keys start with c: or something like that. I opened one and it looked like an event with an event_id that I couldn't find in the Sentry UI. So it seems that some component is not picking up the keys from Redis and transferring them to permanent storage. Our PostgreSQL and Redis are configured outside of the docker-compose setup. Some of the new events are saved and can be seen in the UI, though, so not all of the events go missing.
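For reference, a rough way to see how many of these keys have piled up and how much memory Redis is holding (a sketch only; run redis-cli against whichever Redis instance Sentry is pointed at, since ours is external to docker-compose):

$ redis-cli --scan --pattern 'c:*' | wc -l
$ redis-cli info memory | grep used_memory_human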

Event ID

No response

sander85 commented 1 year ago

Forgot to add that I can't see anything interesting in any of the logs that I can access. No errors or warnings. Some of the events just randomly remain in Redis.

hubertdeng123 commented 1 year ago

Thanks for bringing this up, will have to do some investigation here

github-actions[bot] commented 1 year ago

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

sander85 commented 1 year ago

The issue hasn't been fixed. I don't have any more info to add either, as I don't know which component is responsible for cleaning up Redis.

igorcoding commented 1 year ago

We started to face the same issue on version 22.8.0 for no reason whatsoever. No upgrades, no downtime; Redis just started consuming more and more RAM. I can confirm that the problem is that event keys are not being deleted. Unfortunately, I can't figure out what caused this, because there are no errors or warnings in the logs of the Sentry processes. It looks like some sort of failure in task processing, but again, I can't get to the root of it. Here is the output of redis-cli --bigkeys: (screenshot). I've been thinking about updating Sentry, but as far as I can see the problem exists in newer versions as well.

TheBlusky commented 1 year ago

Same issue here. Is there any workaround / task to run to free memory?

robkorv commented 1 year ago

Same issue here. Is there any workaround / task to run to free memory?

See https://github.com/getsentry/self-hosted/issues/1796#issuecomment-1304859942

le0pard commented 1 year ago

We started getting the same issue with Redis after updating from 22.12.0 to 23.2.0 (with this update we switched to AWS S3 for Node Storage).

(screenshot)

hheexx commented 1 year ago

Same problem happening here on 23.2.0 as well

namco1992 commented 1 year ago

The fix in https://github.com/getsentry/self-hosted/pull/1817 solved our problem; maybe you can try it and see if it works for you.

le0pard commented 1 year ago

@namco1992 it was merged, so it's already in the yml file that was fetched for the 23.2.0 update

hheexx commented 1 year ago

This one worked for me: https://github.com/getsentry/self-hosted/issues/1796#issuecomment-1304859942

le0pard commented 1 year ago

Yes, but this is not a fix for the bug. It just hides the problem by removing Redis keys before it's actually time to remove them (because we don't have space for new keys).

le0pard commented 1 year ago

I did some monitoring of Redis. It's hard to say what causes the memory growth (too many events), but here is an interesting pattern:

Some keys are inserted with the "c:1:e:" prefix:

"SETEX" "c:1:e:aa4223d9fb9b4acf953868aa04d00e87:46" "86400" "{\"event_id\":\"aa4223d9fb9b4acf953868aa04d00e87\"

After this, the data is read and deleted via a key with an additional ":a" suffix:

"GET" "c:1:e:aa4223d9fb9b4acf953868aa04d00e87:46:a"
"DEL" "c:1:e:aa4223d9fb9b4acf953868aa04d00e87:46:a"

After some time, some process reads the original key:

"GET" "c:1:e:aa4223d9fb9b4acf953868aa04d00e87:46"

But nobody cleans it up, so we have to wait 24 hours (based on the "SETEX" value of 86400 seconds) for Redis itself to expire these events. And these events are not small, sometimes ~200 KB in size.
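You can double-check the remaining TTL and the size of one of these keys like this (a sketch; MEMORY USAGE requires Redis 4.0+):

$ docker-compose exec redis redis-cli ttl "c:1:e:aa4223d9fb9b4acf953868aa04d00e87:46"
$ docker-compose exec redis redis-cli memory usage "c:1:e:aa4223d9fb9b4acf953868aa04d00e87:46"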

Here is the number of such keys in Redis growing over time (10 seconds between each command run):

$ docker-compose exec redis redis-cli --scan --pattern 'c:1:e:*' | wc -l
162070
$ docker-compose exec redis redis-cli --scan --pattern 'c:1:e:*' | wc -l
162136
$ docker-compose exec redis redis-cli --scan --pattern 'c:1:e:*' | wc -l
162205
darklow commented 1 year ago

If you don't really care about anything unprocessed and hit this issue when Redis eats up all memory and eventually crashes the whole on-premise deployment, what helped me was the following:

docker exec -it sentry-self-hosted_redis_1 redis-cli flushall

which basically deletes everything in Redis. We had a couple of projects creating hundreds of thousands of issues, which led to disk space problems, and even after resizing disk and memory the Sentry services kept crashing, until I noticed that it was Redis eating up so much memory and clearing the whole backlog helped.
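If flushing everything is too aggressive for you, a more targeted variant is to delete only the event-cache keys. This is an untested sketch that assumes the same container name as above and GNU xargs inside the redis image:

docker exec -i sentry-self-hosted_redis_1 sh -c "redis-cli --scan --pattern 'c:1:e:*' | xargs -r -n 100 redis-cli del"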

meotimdihia commented 9 months ago

Redis is at 150 GB+.
Do we have any command to clean up Redis? I don't need old events :) docker exec -it sentry-self-hosted_redis_1 redis-cli flushall doesn't work for me. I cleaned up Kafka but don't know how to do the same for Redis.

azaslavsky commented 9 months ago

docker exec -it sentry-self-hosted_redis_1 redis-cli flushall doesn't work for me.

Hmm, just tried this on a local instance and it seemed to work? Have you tried scrubbing the docker volume backing redis?
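In case it helps, wiping the volume would look roughly like this (a sketch; in self-hosted the Redis volume is usually named sentry-redis, but check docker volume ls first, and note this throws away everything Redis holds, including queued/unprocessed events):

docker-compose down
docker volume rm sentry-redis
docker-compose up -d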

meotimdihia commented 9 months ago

@azaslavsky I could be mistaken, but it looks like the command docker exec -it sentry-self-hosted_redis_1 redis-cli flushall works for me again.