Closed UncleVic closed 3 months ago
Hi @UncleVic, which rueidis version are you using?
Besides WithContext, what other methods are you using?
Hi @UncleVic, which rueidis version are you using?
v1.0.39
Besides WithContext, what other methods are you using?
I'm using ForceWithContext and now I'm using TryWithContext. They work quickly enough.
I have now upgraded the library to v1.0.43 and I get the same result.
Yes, this is a bug and I have found the root cause. I will release a fix soon. Thank you for reporting.
hello @rueian ~,
could you briefly talk about the root cause?
I occasionally could not acquire the lock when using WithContext with a timeout of 10 seconds. Here is some information:
I am still trying to find the root cause based on some information such as metrics from the redis cluster and client code. Thank you.
Hi @UncleVic, @dangngoctam00
There was a race in WithContext which could lead to two issues:
A fix is out https://github.com/redis/rueidis/pull/604 and there is a pre-release https://github.com/redis/rueidis/releases/tag/v1.0.44-alpha.1 based on the fix.
Please let me know if the fix solves your problems.
hello @rueian, about item 2, do you mean the monitoring function? I just want to know more and explain the problem I'm facing.
Thank you very much.
Hi @dangngoctam00,
Yes. Previously, at the end of the monitoring function, we always removed the watcher once its reference count (g.w) dropped to 0. That could cause us to miss some client-side caching events. The new implementation keeps the watcher as long as there is either a pending WithContext call or an acquired lock.
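To illustrate the rule described above, here is a minimal sketch of reference-counted watcher removal. It is not the rueidislock implementation: the `watcher` and `registry` types, the `held` flag, and the `enter`/`leave`/`release` methods are all hypothetical names chosen for this example. It only shows the idea that a watcher is dropped when it has neither pending waiters nor an acquired lock.

```go
package main

import (
	"fmt"
	"sync"
)

// watcher is hypothetical per-key bookkeeping, not the library's gate type.
type watcher struct {
	waiters int  // number of blocking acquisitions still waiting
	held    bool // whether this client currently holds the lock
}

// registry tracks one watcher per lock key.
type registry struct {
	mu sync.Mutex
	m  map[string]*watcher
}

func newRegistry() *registry {
	return &registry{m: make(map[string]*watcher)}
}

// enter registers a new waiter for key, creating the watcher if needed.
func (r *registry) enter(key string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	w, ok := r.m[key]
	if !ok {
		w = &watcher{}
		r.m[key] = w
	}
	w.waiters++
}

// leave is called when a waiter stops waiting. acquired reports whether
// it obtained the lock. The watcher is kept while the lock is held.
func (r *registry) leave(key string, acquired bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	w, ok := r.m[key]
	if !ok {
		return
	}
	w.waiters--
	if acquired {
		w.held = true
	}
	r.removeIfIdle(key, w)
}

// release marks the lock as no longer held.
func (r *registry) release(key string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	w, ok := r.m[key]
	if !ok {
		return
	}
	w.held = false
	r.removeIfIdle(key, w)
}

// removeIfIdle drops the watcher only when nobody waits and nothing is held.
// The caller must hold r.mu.
func (r *registry) removeIfIdle(key string, w *watcher) {
	if w.waiters <= 0 && !w.held {
		delete(r.m, key)
	}
}

// has reports whether a watcher exists for key.
func (r *registry) has(key string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	_, ok := r.m[key]
	return ok
}

func main() {
	r := newRegistry()
	r.enter("k")
	r.leave("k", true)      // waiter count is 0, but the lock is held
	fmt.Println(r.has("k")) // true: watcher kept while the lock is held
	r.release("k")
	fmt.Println(r.has("k")) // false: no waiters and no held lock
}
```

Removing on "count reached 0" alone corresponds to the old behaviour described above; the extra `held` check is what keeps the watcher alive between acquiring and releasing.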
Hi @rueian, after checking your message and reading the code again, I have some thoughts. Could you help me review them?
The client gets a context deadline error. Based on the code at version v1.0.37, this error is only returned from the waitgate function. There are 2 cases:
ch of struct gate in function waitgate:
type gate struct {
	ch  chan struct{}
	csc []chan struct{}
	w   int
}
but after checking the usage of this channel in the functions waitgate and monitor, I think it's impossible.
waitgate, I'm using Redis cluster mode in AWS, but based on the data I still think it's impossible, because the key of each lock request is unique; I could check it through Kibana. Do you have any idea about it? Thank you.
Hi @dangngoctam00,
In v1.0.37, WithContext would return a deadline exceeded error only when the following line https://github.com/redis/rueidis/blob/b514a567d88619c445c188248bae2515d166b0e4/rueidislock/lock.go#L212 was not triggered.
So, the problem was actually two questions:
1. Why did waitgate enter the select case section in your scenario? The first waitgate call shouldn't go into that section but other concurrent waitgate calls should.
2. Why was the <-g.ch case not triggered? This must be a bug.
Note that we can only fix the bug in the new version.
Hi @rueian , I will try to check if there is any concurrency based on your item 1.
Note that we can only fix the bug in the new version.
Yes, I understand, thank you.
Hi @dangngoctam00,
The original high CPU load issue has been resolved by #604. If you make new discoveries, please let me know in a new issue.
When I try to acquire a lock with WithContext, I get extremely high CPU load (>100%).
I tried to refactor my code and replace WithContext with TryWithContext and a ticker. After that, the CPU load dropped to 1%.
I create a locker with these parameters:
I attempted to play with different parameters, but unfortunately haven't had a positive result :(
Redis 7.2.4