Open HoneyryderChuck opened 4 years ago
Thank you for filing an issue.
Relevant output:
$ spec/support/specs.sh run
...
with cellulloid
behaves like an snmp client
#get
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received
I've tracked this down to this monitor.closed?
which hangs and never calls the underlying C extension method: https://github.com/celluloid/celluloid-io/blob/49c1021b435073cd7bc9a394658d273488f4105e/lib/celluloid/io/reactor.rb#L52
I would suspect an issue relating to the mutex here:
https://github.com/oracle/truffleruby/blob/3b756ceddba635edb4a5673ab999a26f427074de/lib/truffle/truffle/cext_ruby.rb#L37
The specs appear to pass when CEXT_LOCK
is set to false.
@bjfish Interesting, could you show a Java thread dump (from Ctrl+) or Ruby thread dump (send SIGALRM to the process) when that happens? Probably something else is keeping the C-ext lock and never releasing it.
@eregon Java thread dump: https://gist.github.com/bjfish/ab5b30e7a29c009d46a5db0d889d9df1 Ruby thread dump: https://gist.github.com/bjfish/8d4f1441f561038cef02c8fed9da876d
OK that shows:
"Ruby Fiber id=14 from /Users/brandonfish/Documents/oracle-workspace/gem_home/gems/celluloid-0.17.4/lib/celluloid/task/fibered.rb:9" #14 prio=5 os_prio=31 tid=0x00007f8a2c5a8800 nid=0x3f03 waiting on condition [0x000070000a0f0000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c07b3678> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireInterruptibly(AbstractQueuedSynchronizer.java:897)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1222)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
at org.truffleruby.core.mutex.MutexOperations.lambda$lockInternal$0(MutexOperations.java:40)
at org.truffleruby.core.mutex.MutexOperations$$Lambda$806/1719946447.block(Unknown Source)
at org.truffleruby.core.thread.ThreadManager.runUntilResult(ThreadManager.java:482)
at org.truffleruby.core.mutex.MutexOperations.lockInternal(MutexOperations.java:39)
at org.truffleruby.cext.CExtNodes$CallCWithMutexNode.callCWithMutex(CExtNodes.java:135)
is waiting for the C-ext lock, but this fiber:
"Ruby Thread id=16 from /Users/brandonfish/Documents/oracle-workspace/gem_home/gems/celluloid-0.17.4/lib/celluloid/group/spawner.rb:47" #16 prio=5 os_prio=31 tid=0x00007f8a29cec800 nid=0x3d07 waiting on condition [0x0000700009ff2000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007837cd1b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at org.truffleruby.core.fiber.FiberManager.lambda$waitForResume$3(FiberManager.java:236)
at org.truffleruby.core.fiber.FiberManager$$Lambda$809/345437432.block(Unknown Source)
at org.truffleruby.core.thread.ThreadManager.runUntilResultKeepStatus(ThreadManager.java:450)
at org.truffleruby.core.fiber.FiberManager.waitForResume(FiberManager.java:234)
...
Locked ownable synchronizers:
- <0x00000007838c19f0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
- <0x00000006c07b3678> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
holds the C-ext locks and is waiting for the next Fiber#resume.
We need to figure out why it's still holding the C-ext lock (is there some C-ext code ending up in Fiber.yield
?) and if it should have released it before.
If it's actually a C-ext calling (can be indirectly) Fiber.yield
then we need to figure where is a good place to release the C-ext lock.
https://travis-ci.org/swisscom/ruby-netsnmp/jobs/617962889?utm_medium=notification&utm_source=github_status
@eregon follow up of #1730