The behaviour of Redis changed between versions 6.0 and 6.2 with regard to keys that expire in the middle of a MULTI block (see https://github.com/redis/redis/pull/7920). This caused issues in our production environment, since WATCH prevented the EXEC in create_bucket from succeeding whenever a watched key expired. The result was a race condition that prevented the creation of a new bucket time after time.
While trying to fix this behaviour, we realized that the complexity added by WATCHing different keys was what drove the problem. Our intention is to remove the need for WATCH (used as an optimistic locking mechanism) and simplify this piece of code.
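To make the failure mode concrete, here is a minimal sketch of optimistic locking in which the expiry of a watched key counts as a modification. It is a toy in-memory model written only from the description above, not the Redis implementation and not this library's code; `TinyStore`, `try_create_bucket`, the `expiry_touches_watch` flag and the key names are all hypothetical.

```python
class TinyStore:
    """Toy in-memory stand-in for Redis WATCH/EXEC semantics (NOT a real client)."""

    def __init__(self, expiry_touches_watch):
        self.data = {}        # key -> value
        self.expires_at = {}  # key -> logical expiry time in ms
        self.versions = {}    # key -> counter bumped on every change
        self.expiry_touches_watch = expiry_touches_watch  # hypothetical switch
        self.now = 0          # logical clock in ms, advanced manually

    def _touch(self, key):
        self.versions[key] = self.versions.get(key, 0) + 1

    def set(self, key, value, ttl_ms=None):
        self.data[key] = value
        if ttl_ms is None:
            self.expires_at.pop(key, None)
        else:
            self.expires_at[key] = self.now + ttl_ms
        self._touch(key)

    def advance(self, ms):
        """Move the clock forward and drop every key whose TTL has passed."""
        self.now += ms
        for key in [k for k, t in self.expires_at.items() if t <= self.now]:
            del self.data[key]
            del self.expires_at[key]
            if self.expiry_touches_watch:
                self._touch(key)  # expiry is treated as a modification

    def watch(self, *keys):
        """Remember the current version of each key."""
        return {k: self.versions.get(k, 0) for k in keys}

    def execute(self, watched, writes):
        """Apply writes only if no watched key changed; None models an aborted EXEC."""
        if any(self.versions.get(k, 0) != v for k, v in watched.items()):
            return None
        for key, value in writes.items():
            self.set(key, value)
        return True


def try_create_bucket(store):
    store.set("old_bucket", 1, ttl_ms=10)
    watched = store.watch("old_bucket", "new_bucket")
    store.advance(20)  # old_bucket expires between WATCH and EXEC
    return store.execute(watched, {"new_bucket": 1})


expiry_ignored = try_create_bucket(TinyStore(expiry_touches_watch=False))
expiry_counts = try_create_bucket(TinyStore(expiry_touches_watch=True))
```

With the flag off the transaction goes through. With it on, the transaction is aborted even though no other client wrote to the key, which is the situation we kept hitting in create_bucket.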
Changes
Removed the need for the Set data structure that kept track of the buckets created every scale_ms milliseconds
Re-implemented the delete_buckets operation using a Redis SCAN operation that traverses all elements of the database
Added a delete_buckets_timeout parameter that times out the client making the call, while the deletion of buckets continues running on the server
Simplified the metadata stored in the buckets, as the keys bucket and id were not returned by get_bucket
Unified the create_bucket and update_bucket operations by using HSETNX, instead of first checking whether the bucket exists to decide on the action
Rewrote the tests to use a real Redis instance running in a Docker container via docker-compose.yml
Added a GitHub Actions CI workflow that runs the tests against a Redis instance
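As a rough illustration of the two central changes (a single create-or-update path built on HSETNX, and deletion by scanning for a key pattern), here is a minimal sketch. It runs against a toy in-memory stand-in rather than a real Redis server, and it is not the code in this PR: `TinyHashStore`, `count_hit`, the field names `created`/`updated`/`count`, the use of an HINCRBY-style increment, and the `Hammer:Redis:` key prefix are assumptions made for the example.

```python
import fnmatch


class TinyHashStore:
    """Toy in-memory stand-in for Redis hashes (NOT a real client)."""

    def __init__(self):
        self.hashes = {}  # key -> {field: value}

    def hsetnx(self, key, field, value):
        """Set the field only if it is absent; return 1 if set, else 0 (like HSETNX)."""
        fields = self.hashes.setdefault(key, {})
        if field in fields:
            return 0
        fields[field] = value
        return 1

    def hset(self, key, field, value):
        self.hashes.setdefault(key, {})[field] = value

    def hincrby(self, key, field, amount):
        fields = self.hashes.setdefault(key, {})
        fields[field] = int(fields.get(field, 0)) + amount
        return fields[field]

    def scan_iter(self, match):
        """Yield keys matching a glob pattern (real SCAN walks the keyspace in batches)."""
        for key in list(self.hashes):
            if fnmatch.fnmatchcase(key, match):
                yield key

    def delete(self, key):
        return 1 if self.hashes.pop(key, None) is not None else 0


def count_hit(store, key, now_ms, increment=1):
    """Create or update a bucket in one path: no existence check, no WATCH."""
    store.hsetnx(key, "created", now_ms)  # only the first hit records the creation time
    store.hset(key, "updated", now_ms)
    return store.hincrby(key, "count", increment)


def delete_buckets(store, bucket_id, prefix="Hammer:Redis:"):
    """Delete every bucket belonging to one id by scanning for a key pattern."""
    pattern = f"{prefix}{bucket_id}:*"
    return sum(store.delete(key) for key in store.scan_iter(pattern))


store = TinyHashStore()
count_hit(store, "Hammer:Redis:user1:100", now_ms=1000)
second_count = count_hit(store, "Hammer:Redis:user1:100", now_ms=1500)
bucket = dict(store.hashes["Hammer:Redis:user1:100"])
count_hit(store, "Hammer:Redis:user2:100", now_ms=1500)
deleted = delete_buckets(store, "user1")
```

Because HSETNX is a no-op when the field already exists, the same sequence of commands is correct for both a new and an existing bucket, so there is nothing left to guard with WATCH. The trade-off on the deletion side is that a scan has to walk the whole keyspace, which is why the client-side timeout was added.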
Notes
Joint effort between @ricmarinovic, @chaodhib and myself
We are successfully running our fork in production
Fixes: https://github.com/ExHammer/hammer-backend-redis/issues/26