Open sant0s opened 5 years ago
If true, this could defeat the purpose of await
somewhat. That said, the 2nd and 3rd requests are apparently running concurrently; perhaps the first web request handling thread is incorrectly doing dual duty as both the request thread dispatching and as a request thread handler itself. If so, that could lead to every one request in N requests blocking all subsequent requests until it's complete.
I'll dig into the code to investigate further. I might be guessing the completely wrong thing.
Any luck?
I'm unable to reproduce the issue with
Am I doing something wrong? I'm opening "simultaneously" that 3 tabs by quickly middle clicking on the link in this issue three times.
~ _ _
~ _ __ | | __ _ _ _| |
~ | '_ \| |/ _' | || |_|
~ | __/|_|\____|\__ (_)
~ |_| |__/
~
~ play! 1.5.2, https://www.playframework.com
~
~
~ Warning: conflict on command doc:export
~
~ Ctrl+C to stop
~
~ using java version "1.8.0_212"
Listening for transport dt_socket at address: 8000
.
.
.
12:38:52,075 INFO ~ [play-thread-1] Application 'play-issue-1304-blocked-pool-threads' is now started !
12:38:52,129 INFO ~ [play-thread-1] before await
12:38:52,129 INFO ~ [pool-5-thread-1] before sleep
12:38:52,129 DEBUG ~ [play-thread-1] suspend()
12:38:52,130 DEBUG ~ [play-thread-1] push object java.util.concurrent.FutureTask@1037802910/unknown [java.util.concurrent.FutureTask@3ddb9d9e] i[0],l[0],d[0],f[0],o[0],r[0]
12:38:52,130 DEBUG ~ [play-thread-1] push int 2 i[0],l[0],d[0],f[0],o[1],r[0]
12:38:52,133 WARN ~ [play-thread-1] Start WaitForTasksCompletion
12:38:52,135 INFO ~ [play-thread-1] before await
12:38:52,135 INFO ~ [pool-5-thread-2] before sleep
12:38:52,135 DEBUG ~ [play-thread-1] suspend()
12:38:52,135 DEBUG ~ [play-thread-1] push object java.util.concurrent.FutureTask@1190650721/unknown [java.util.concurrent.FutureTask@46f7e361] i[0],l[0],d[0],f[0],o[0],r[0]
12:38:52,135 DEBUG ~ [play-thread-1] push int 2 i[0],l[0],d[0],f[0],o[1],r[0]
12:38:52,216 INFO ~ [play-thread-1] before await
12:38:52,216 INFO ~ [pool-5-thread-3] before sleep
12:38:52,216 DEBUG ~ [play-thread-1] suspend()
12:38:52,216 DEBUG ~ [play-thread-1] push object java.util.concurrent.FutureTask@114452413/unknown [java.util.concurrent.FutureTask@6d267bd] i[0],l[0],d[0],f[0],o[0],r[0]
12:38:52,216 DEBUG ~ [play-thread-1] push int 2 i[0],l[0],d[0],f[0],o[1],r[0]
12:39:12,129 INFO ~ [pool-5-thread-1] after sleep
12:39:12,135 INFO ~ [pool-5-thread-2] after sleep
12:39:12,163 DEBUG ~ [play-thread-1] pop int 2 i[0],l[0],d[0],f[0],o[1],r[0]
12:39:12,163 DEBUG ~ [play-thread-1] pop object java.util.concurrent.FutureTask@1037802910/unknown [java.util.concurrent.FutureTask@3ddb9d9e]
12:39:12,171 INFO ~ [play-thread-1] after await
12:39:12,216 INFO ~ [pool-5-thread-3] after sleep
12:39:12,518 DEBUG ~ [play-thread-1] pop int 2 i[0],l[0],d[0],f[0],o[1],r[0]
12:39:12,518 DEBUG ~ [play-thread-1] pop object java.util.concurrent.FutureTask@1190650721/unknown [java.util.concurrent.FutureTask@46f7e361]
12:39:12,518 INFO ~ [play-thread-1] after await
12:39:12,526 DEBUG ~ [play-thread-1] pop int 2 i[0],l[0],d[0],f[0],o[1],r[0]
12:39:12,526 DEBUG ~ [play-thread-1] pop object java.util.concurrent.FutureTask@114452413/unknown [java.util.concurrent.FutureTask@6d267bd]
12:39:12,526 INFO ~ [play-thread-1] after await
The empty line is where the 20 seconds wait happens.
@xabolcs You're doing nothing wrong and I am glad that Play works as expected, as this would be a major issue (literally a "blocker" as threads get blocked :)).
Your set up (Java and Play versions) is similar to mine, so I started wondering what else could be different. I re-tested and got the following results:
Scenario A: same behaviour as I reported i.e. it doesn't work Scenario B: same behaviour as you reported i.e. it works
Can you guess what's the difference? :)
Scenario A corresponds to a VirtualBox guest, while Scenario B corresponds to a plain non-virtualised machine (as I assume you tested on).
Any idea about the underlying reason for this mismatching behaviour? It might be related to thread scheduling, but no clue on details.
So these are good news for Play apps deployed in physical machines, but an issue for those deployed in virtualised servers (in production) as they may stall.
Nice findings, @sant0s ! Yes, I was testing on bare metal hardware.
How many processors does have that VirtualBox guest? In other words: is the bug related to processor count or to virtualization?
Sorry, I didn't find the time to look too deeply into this as I originally promised. I was having difficulty reproducing, it's good to see I'm not the only one.
I can't see how more CPUs on a physical machine over a VM would change the problem (although, like most concurrency bugs, it can certainly exacerbate it.) This bug seems to be more related to concurrency and not parallelism.
If the pool is large and there are only three tabs open, is it possible that Play just doesn't allocate the main web request dispatch thread by sheer fluke? That seems pretty unlikely if you've refreshed the tabs enough though, so it seems more likely my guess is completely wrong...
@xabolcs 2 CPUs. Both scenarios - A (VirtualBox guest) and B (host, bare metal) - are run under 2 CPUs each (the host has 4 CPUs). I'd guess virtualisation related.
@LouisJackman No problem.
I wonder if Play 2 is affected by this issue.
References
Discuss Play Forum: https://discuss.lightbend.com/t/continuations-partially-blocked-pool-threads/3752 Lighthouse (copy of this ticket): https://play.lighthouseapp.com/projects/57987/tickets/2390-continuations-partially-blocked-pool-threads
Description
Play! version: 1.5.2 Java version: 1.8.0_201 Ubuntu 18.04.2 LTS
Consider a Play app consisting of a controller that makes use of continuations in order to allow simultaneous requests to be run by pool threads.
Create a test app
play new test
and change the application controller to
Then add conf/log4j.properties,
start play
play run
and simultaneously open 3 browser tabs at http://localhost:9000/
The results are:
Logs show that:
Expected Behavior
Since the 3 tabs were opened at (around) the same time, they should all be loaded after (around) 20 seconds i.e. the 3 pool threads should start/complete handling them at (around) the same time. Furthermore, play-thread-1 should not be blocked.
Actual Behavior
All pool threads should start/complete handling simultaneous requests at (around) the same time and play-thread-1 should not block i.e. it should handle other requests while waiting for pool threads to complete.
Reproducible Test Case
The actual behaviour is consistently reproducible by following the above-mentioned steps.