Describe the bug
Dragonfly is killed by the OOM killer even when started with --cache_mode=true. This happens after several hours or days of intense load.
To Reproduce
We noticed this on several services that were processing ~200k commands/s and had between 11k and 15k clients. At the time of the OOM restart, RSS was 162GB while used memory was 128GB, so roughly 34GB of resident memory was not accounted for as used memory. The systemd journal around the kill:
Jun 08 13:56:10 dragonfly-1-4 systemd[1]: dragonfly.service: A process of this unit has been killed by the OOM killer.
Jun 08 13:56:38 dragonfly-1-4 systemd[1]: dragonfly.service: Main process exited, code=killed, status=9/KILL
Jun 08 13:56:38 dragonfly-1-4 systemd[1]: dragonfly.service: Failed with result 'oom-kill'.
Jun 08 13:56:38 dragonfly-1-4 systemd[1]: dragonfly.service: Consumed 2w 4d 4h 34min 1.671s CPU time.
Jun 08 13:56:53 dragonfly-1-4 systemd[1]: dragonfly.service: Scheduled restart job, restart counter is at 1.
Jun 08 13:56:53 dragonfly-1-4 systemd[1]: Stopped dragonfly.service - Aiven dragonfly in container.
Jun 08 13:56:53 dragonfly-1-4 systemd[1]: dragonfly.service: Consumed 2w 4d 4h 34min 1.671s CPU time.
Jun 08 13:56:53 dragonfly-1-4 systemd[1]: Started dragonfly.service - Aiven dragonfly in container.
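To make the gap between RSS and used memory easier to track before the next OOM kill, here is a minimal sketch that computes it from the text of an `INFO memory` reply. It assumes the reply contains `used_memory` and `used_memory_rss` fields in bytes; the helper names `parse_info` and `rss_overhead` are hypothetical, and the sample reply is built from the figures quoted above (taking 1GB as 10^9 bytes), not captured from the affected node.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse the key:value lines of an INFO reply into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def rss_overhead(info_text: str) -> tuple[int, float]:
    """Return (rss - used, rss / used), assuming both fields are in bytes."""
    fields = parse_info(info_text)
    used = int(fields["used_memory"])
    rss = int(fields["used_memory_rss"])
    return rss - used, rss / used


GB = 10**9  # assumption: the figures in this report are decimal gigabytes

# Sample reply reconstructed from the numbers in this report.
sample = f"# Memory\r\nused_memory:{128 * GB}\r\nused_memory_rss:{162 * GB}\r\n"

gap, ratio = rss_overhead(sample)
print(f"RSS exceeds used memory by {gap / GB:.0f}GB (ratio {ratio:.2f})")
```

Polling this periodically and logging the ratio would show whether the overhead grows steadily under load or jumps shortly before the kill.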
Expected behavior
Dragonfly shouldn't crash.
Environment (please complete the following information):