qless-core currently stores a job's failure message as part of its data. For most worker types, this means every failed job carries an exception message and backtrace in its job data. We've seen thousands of jobs fail with identical failure messages, which bloats our memory usage considerably. Additionally, these keys never expire, so the only way to prevent running out of memory is to handle these errors manually.
I discussed this with @myronmarston, and we came up with a few changes we can make to solve this problem:
Move job failure messages into separate keys with their own expirations. That way, jobs that fail won't be dropped on the floor, but their failure messages won't occupy memory indefinitely.
Key the job failure messages by the hash of the message. This handles the duplicate error message issue.
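To make the two changes concrete, here is a minimal Python sketch of the idea, not qless-core's actual implementation (which is Lua running inside Redis). Everything in it is an assumption for illustration: the `FailureStore` class, the `ql:failure:` key prefix, the `failure_key` field on the job, and the choice of SHA-1. A plain dictionary stands in for Redis keys with expirations so the example runs on its own.

```python
import hashlib
import time


class FailureStore:
    """In-memory stand-in for Redis string keys with per-key expiration."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._messages = {}  # key -> (message, expires_at)

    def record_failure(self, job, message):
        """Store the message under a key derived from its hash; return that key."""
        digest = hashlib.sha1(message.encode("utf-8")).hexdigest()
        key = "ql:failure:" + digest  # hypothetical key name, not qless-core's schema
        # Identical messages map to the same key, so they are stored only once.
        # Each new failure refreshes the expiration (one possible policy).
        self._messages[key] = (message, self.clock() + self.ttl_seconds)
        # The job keeps only a small reference instead of the full backtrace.
        job["failure_key"] = key
        return key

    def failure_message(self, job):
        """Return the job's failure message, or None if it has expired."""
        entry = self._messages.get(job.get("failure_key"))
        if entry is None:
            return None
        message, expires_at = entry
        if self.clock() >= expires_at:
            del self._messages[job["failure_key"]]
            return None
        return message
```

With this layout, a thousand jobs failing with the same message cost one stored copy plus a short key per job, and the failed job itself outlives its message: once the message expires, the job is still there but its message lookup returns nothing.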
What do you think, @dlecocq?