Open MarkKharitonov opened 1 month ago
@MarkKharitonov Thank you for raising this issue.
In the current KubeBlocks Redis, the kb-post-provision-job-xxx job is primarily used to register Redis with every Redis Sentinel instance, which enables high availability for the Redis cluster.
Currently, the implementation of this job is not idempotent. When Redis successfully registers with some Redis Sentinel instances but fails to register with others (for example, because of network connectivity issues or unhealthy instances), the post-provision job fails and retries (a retry can also be triggered by deleting the job, as you mentioned).
When the job retries, the Sentinel instances that were already registered successfully reject the repeated registration with the error "ERR Duplicate master name". This is the cause of the issue you encountered.
We will address this problem in the future by optimizing the Redis registration logic to make it idempotent.
Thank you again for bringing this to our attention.
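For illustration only, here is a minimal Python sketch of what an idempotent registration loop could look like. It is not the actual KubeBlocks job: `register_master`, `send_monitor`, and `SentinelCommandError` are hypothetical names, and `send_monitor` stands in for whatever issues `SENTINEL MONITOR` against a single Sentinel. The idea is to treat the "Duplicate master name" reply as "already registered" instead of as a failure, so a retry only has to deal with the Sentinels that genuinely failed.

```python
from typing import Callable, Iterable, List, Tuple

# Error text a Sentinel returns when the master name is already monitored.
DUPLICATE_MASTER = "Duplicate master name"


class SentinelCommandError(Exception):
    """Hypothetical error type carrying a Sentinel's error reply text."""


def register_master(
    sentinels: Iterable[str],
    send_monitor: Callable[[str], None],
) -> Tuple[List[str], List[str]]:
    """Try to register the master with every Sentinel.

    send_monitor(addr) is a hypothetical helper that issues SENTINEL MONITOR
    against the Sentinel at addr and raises SentinelCommandError on an
    error reply. Returns (registered, failed) lists of Sentinel addresses.
    """
    registered: List[str] = []
    failed: List[str] = []
    for addr in sentinels:
        try:
            send_monitor(addr)
        except SentinelCommandError as exc:
            if DUPLICATE_MASTER in str(exc):
                # Registered by an earlier attempt: count it as success.
                registered.append(addr)
            else:
                # Any other error reply is a real failure worth retrying.
                failed.append(addr)
        else:
            registered.append(addr)
    return registered, failed
```

With this approach the job would fail only when `failed` is non-empty, and rerunning it after a partial success would no longer trip over the Sentinels that are already registered.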
This issue has been marked as stale because it has been open for 30 days with no activity
Describe the bug
To Reproduce
Not sure, but it reproduces very easily for me: I just need to delete the job so that it is created again, and it errors out.
Expected behavior
No errors.
Additional context
I have 5 Redis instances deployed with KB, each with sentinels and each having 2 replicas for the database and 3 for the sentinels. Only one instance exhibits the problematic behavior: