Closed mlartz closed 6 years ago
Looking at your example in https://github.com/stompgem/stomp/issues/142#issuecomment-314870902, if you remove the require 'logging'
line then you no longer see deadlocks? Is that correct?
Exactly. Removing the require 'logging'
line causes the script to not deadlock, however there are some other issues, as noted by the author of the STOMP gem. This obviously calls logging
into question, but the jstack trace seems to indicate that its some interaction with Jruby, SSLSocket, and IO when both reading and writing to the TLS "socket". Although it gets a little confusing, because somehow the threads/processes get labeled with "diagnotic_context.rb".
The only interesting thing the logging framework is doing with threads is in relation to the diagnostic contexts. The goal of that chunk of code is to capture the diagnostic context from the parent thread when a new thread is spawned. The only way to figure out the parent thread is to capture the information when a new thread is created, so that is why new
, start
, and fork
are being monkey patched.
This behavior can be overridden via an environment variable. Would you be willing to run an experiment with your deadlock script and run it with the require 'logging'
line present and with the LOGGING_INHERIT_CONTEXT=false
environment variable set? I'd like to see if that might be contributing to the deadlock issues you are seeing.
(Un?)Fortunately, I found no difference when running with or without the environment variable.
LOGGING_INHERIT_CONTEXT=false jruby -J-Xmx2048m -S bundle exec ruby bin/deadlock_test
Other than threading, does logging do any monkeypatching with IO objects?
(Un?)Fortunately
Ha Ha Ha Ha! :joy:
That made me laugh - thank you :bowing_man:
The sad news here is that the logging framework is not doing any crazy initialization or anything related to sockets or IO. You have to instantiate appenders and loggers before any actions are taken.
Here is the list of files required by the logging framework:
fileutils
little-plugger
multi_json
rbconfig
set
singleton
socket
stringio
syslog
thread
yaml
One debugging option here is to replace the require 'logging'
with a bunch of require lines for each of the libraries listed above. Remove the require statements one by one to see if one of them might contributing to the issue. Other than that I'm pretty stumped.
Is there any output about which thread is deadlocking and where?
Notes from the stomp gem lead:
I cannot recreate a deadlock (or block/hang) using this test code:
https://gist.github.com/gmallard/f7c709bc26d5a16b6db6ebeeea20f18a
Here is a short test log:
https://gist.github.com/gmallard/1641573d0e040ba2dda4328e08e3071a
Investigation of the issue reported to me uncovered another bug - but this one will not recreate.
For reference:
Thanks for the debug tips. I was able to replicate a deadlock without any of gem imports, and was later able to replicate the deadlock without logging
required, however it took much longer for the deadlock to occur (20 seconds vs 20 minutes). So at this point it seems like logging
is not involved, sorry for the non-bug report. Everything is OK!
@mlartz thank you so much for the follow up on this issue. Good luck with the deadlock debugging! :+1:
I've been having some issues with porting my STOMP consumer to JRuby and am currently getting a deadlock (ref https://github.com/stompgem/stomp/issues/142) when I attemp to
require 'logging'
.The STOMP gem uses a thread to receive STOMP messages via the network over TLS, and I use a thread pool to process/ack that data to the STOMP client. I've pared the offending code down to a very small consumer, which doesn't replicate the issue (although it might have other issues).
That said, when I
require 'logging'
in the stripped down consumer (I use logging in my full application), the demo script deadlocks after it receives/acks a few messages. Unfortunately, I've tried to replicate the issue without the STOMP gem (with just threads/thread pool) but have been unable to. So this issue might pertain to any one of JRuby, JRuby-SSL, STOMP (gem), or logging (gem). Mostly, I'm just wondering if 'logging' does anything tricksy on load that might have an effect on threading/locking/IO/SSL under JRuby?I'm using jruby 9.0.3.0 (2.2.2) 2015-10-21 633c9aa Java HotSpot(TM) 64-Bit Server VM 25.45-b02 on 1.8.0_45-b14 +jit [linux-amd64]