Closed Abdullah-l closed 3 months ago
Hi @Abdullah-l - I'm not able to replicate this; can you provide a repository/code sample of this occurring?
Further illustrated no apparent issue here: https://github.com/bigcommerce/gruf/pull/207
Hey @splittingred, I managed to reproduce it in your rake task :)
I created two branches with the exact same changes
The issue is easily seen in the logs Before 2.17.0
I, [2024-02-23T02:24:05.015640 #20703] INFO -- : Received get1 at 2024-02-23 02:24:05 +0300
I, [2024-02-23T02:24:06.221235 #20703] INFO -- : Received get2 at 2024-02-23 02:24:06 +0300
I, [2024-02-23T02:24:06.325923 #20703] INFO -- : Received get3 at 2024-02-23 02:24:06 +0300
After 2.17.0
I, [2024-02-23T02:25:00.553379 #21885] INFO -- : Received get1 at 2024-02-23 02:25:00 +0300
I, [2024-02-23T02:25:30.554698 #21885] INFO -- : Received get2 at 2024-02-23 02:25:30 +0300
I, [2024-02-23T02:25:30.554768 #21885] INFO -- : Received get3 at 2024-02-23 02:25:30 +0300
As you can see after 2.17.0 get2
and get3
requests are blocked for a while there. They arrived 30 seconds later (after get1
timed out).
@Abdullah-l can you provide more information about your working environment:
Ruby version - I tried 2.7, 3.0, 3.1 Rubygems version - 3.3.3 gRPC version installed - 1.61
I'm like 99.99% sure that 2.17.0 broke it
I hope you can reproduce it :)
You can just checkout the two branches I linked in the previous message and run:
./spec/demo_server
bundle exec rake gruf:demo:concurrent
on each one and see the logs
And to clarify, is this under development mode or production mode? (What does Gruf.development?
say?)
Basically the new code uses a read-write lock; any number of requests can concurrently "read", that is, access the controller - but when the controller is reloaded the lock is "write" locked and no concurrency is allowed.
I'd add some logging statements to see the order of operations around reload, the write lock acquire/release and the read lock acquire release.
Using the rake task @splittingred wrote, Gruf.development?
is true
However the issue still occurs when I used it with rails on a staging environment.
@leonid-shevtsov regarding adding logs, I didn't fully understand where to add them and what are we looking for 😅 is it here?
is it here?
yeah, here and also in lines 54-56; before the block (locking
), at the beginning of the block (locked
), and after it (unlocked
); normally, you should not see a very long write locked
period (that would be breaking concurrency indeed)
hmm... perhaps what's happening is the first request locks it for reading, and then the second wants to lock it for writing and hence has to wait until the first one completes? but this should only happen in a development environment.
could it be possible that you're running it in dev mode on staging? dev mode enables code reloading and possibly other things that you generally don't need or want outside of a dev environment.
and looks like the fallback environment is development
:
Closing because I gave up on this 😅
Please describe the issue
Hey guys @splittingred @leonid-shevtsov, check this out: Create two RPCs
Place a log in both RPCs Call RPC 1 then 2
On gruf 2.16.2 it behaves normally by printing both logs
On gruf 2.17.0 it does not print the log in RPC 2. The server does not acknowledge the call at all, the client keeps loading indefinitely unless I cancel RPC 1.
This blocking behaviour I suspect was introduced in #178 as it's the only PR in that release that touches threading.
The issue is reproducible on the latest version too (2.19.0)
Anything else we should know?