Open ShaheedHaque opened 4 years ago
Thanks for the report. I will look into this as soon as my IT folks get my development machine repaired. What is the value of “isJVMStarted()”? Does adding an if statement for it (in Python or in pyjp_module) fix the issue?
First, I have noticed that the SIGSEGV is not easy to reproduce, happening as rarely as perhaps 1 time in 200.
Second, I've added a print() of “isJVMStarted()”, and not seen the failure after a handful of runs.
Will report back after gathering more data.
OK, it failed just now even with the “isJVMStarted()” in place on my Ubuntu setup:
Fatal Python error: Segmentation fault
Thread 0x00007fbca3605740 (most recent call first):
File "/usr/local/lib/python3.8/dist-packages/jpype/_core.py", line 322 in _JTerminate <<< line number changed because of inserted isJVMStarted().
Again, it must have run without issue several hundred times before this point.
Interestingly, my MacOS-based colleague is regularly seeing what we think is the same issue, and he was able to extract a crash log: hs_err_pid53983.log. This repro is from the cycling of Celery workers, with the SIGSEGV at process exit (or at least we assume so, since it has no discernible effect on the operation of the system). Note: he does not have the inserted isJVMStarted().
And here is a curious thing... I just ran my usual test script, and it seemed to exit twice, like this:
...
========== 3 failed, 362 passed, 1261 warnings in 6067.84s (1:41:07) ===========
isJVMStarted=================== True
isJVMStarted=================== False
So, _JTerminate() was called twice: once it saw the JVM as started, and once it did not.
Any chance you can get this to replicate on a reduced version of the code? It still seems like something in the JVM is crashing. So either we are creating the JVM twice as a result of a fork, or we are terminating and restarting it. The guard code is supposed to prevent starting twice, but perhaps if you can replicate a miss and fix it we can finally resolve this.
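One way to test the fork hypothesis would be to print the process ID next to the started flag in the exit handler. The sketch below is only a diagnostic aid, not part of JPype: `mark_started` and `describe_exit` are hypothetical helper names, and the caller is assumed to pass in the value of isJVMStarted().

```python
import os

# Hypothetical bookkeeping: PID of the process that started the JVM.
_owner_pid = None


def mark_started():
    """Call this right after starting the JVM to remember which process owns it."""
    global _owner_pid
    _owner_pid = os.getpid()


def describe_exit(is_started):
    """Build one diagnostic line for an exit handler.

    `is_started` is whatever the isJVMStarted() check returned at exit.
    `forked` is True when the exiting process is not the one that
    recorded itself as the JVM owner.
    """
    pid = os.getpid()
    forked = _owner_pid is not None and pid != _owner_pid
    return f"pid={pid} owner={_owner_pid} forked={forked} isJVMStarted={is_started}"
```

If the two exit lines show different PIDs, the second call is coming from a child process rather than from a terminate and restart in the same process.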
I tried before, but there is a lot of stuff, and when I trimmed too far it stopped failing. Now that we have a slightly different problem, I'll try once again. I'll report back with any results.
What are the ramifications of doing?
def _JTerminate():
    try:
        if _jpype.isStarted():
            _jpype.shutdown()
    except RuntimeError:
        pass
It should make sure that Java closes properly when Python exits. Before we did not guarantee that Java files were properly closed nor that Java threads had terminated. If you call shutdown manually then isStarted will be false and it will just operate as normal. If Python exits without closing Java we perform the Java shutdown first. If there are non-daemon threads then it will wait for them to terminate.
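To make the intended behaviour concrete, here is a minimal sketch of the same guard run against a stand-in object. `FakeJVM` is purely hypothetical and only mimics the two calls used above; in particular, raising RuntimeError on a second shutdown is an assumption for illustration, not a statement about what the real `_jpype` module does.

```python
class FakeJVM:
    """Hypothetical stand-in for the _jpype module (not the real API)."""

    def __init__(self):
        self.started = False
        self.shutdown_calls = 0

    def isStarted(self):
        return self.started

    def shutdown(self):
        # Assumption: shutting down a JVM that is not running is an error.
        if not self.started:
            raise RuntimeError("JVM is not running")
        self.started = False
        self.shutdown_calls += 1


def terminate(backend):
    """Same shape as the guarded _JTerminate above, with the backend passed in."""
    try:
        if backend.isStarted():
            backend.shutdown()
    except RuntimeError:
        pass


jvm = FakeJVM()
terminate(jvm)        # JVM never started: nothing happens
jvm.started = True
terminate(jvm)        # JVM running: shut down once
terminate(jvm)        # already shut down: guard skips the call
```

With this guard, a second call at exit is a no-op, which is the case where shutdown was already called manually.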
What are the ramifications of doing?
def _JTerminate():
    try:
        if _jpype.isStarted():
            _jpype.shutdown()
    except RuntimeError:
        pass
Based on my experiment adding calls to isJVMStarted(), it is not clear to me that this would make any difference, because the SEGV can occur even when the test returns False, as in this example:
=========================== short test summary info ============================
FAILED test/test_suite74gb_franecki.py::TestPeoplesPension::test_100_complete_use_cases[SubmitEnrolmentsAndContributions_]
FAILED test/test_suite90_live.py::TestLiveA::test_400_check_log_files____ - A...
========== 2 failed, 363 passed, 1242 warnings in 6168.39s (1:42:48) ===========
Fatal Python error: Segmentation fault
Thread 0x00007f6260ee6740 (most recent call first):
File "/usr/local/lib/python3.8/dist-packages/jpype/_core.py", line 322 in _JTerminate
isJVMStarted=================== False
Can you look over #937 to see if an option fixes this issue?
Hi. I am using JPype 1.2.0 and have been seeing this issue for a while. The Jenkins build or a local run fails intermittently with the failure below. Any suggestion to resolve this issue is appreciated:
2021-05-06 16:32:36.041
2021-05-06 16:32:36.041 Thread 0x00007fee56e3a100 (most recent call first):
2021-05-06 16:32:36.041 File "/opt/app-root/lib64/python3.8/site-packages/jpype/_core.py", line 340 in _JTerminate
2021-05-06 16:32:36.041 #
2021-05-06 16:32:36.041 # A fatal error has been detected by the Java Runtime Environment:
2021-05-06 16:32:36.042 #
2021-05-06 16:32:36.042 # SIGSEGV (0xb) at pc=0x00007fee55d6a9bf (sent by kill), pid=270, tid=427
2021-05-06 16:32:36.042 #
2021-05-06 16:32:36.042 # JRE version: OpenJDK Runtime Environment 18.9 (11.0.9.1+1) (build 11.0.9.1+1-LTS)
2021-05-06 16:32:36.042 # Java VM: OpenJDK 64-Bit Server VM 18.9 (11.0.9.1+1-LTS, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
2021-05-06 16:32:36.042 # Problematic frame:
2021-05-06 16:32:36.042 # C [libpthread.so.0+0x129bf] raise+0x10f
2021-05-06 16:32:36.042 #
2021-05-06 16:32:36.042 # Core dump will be written. Default location: /home/jenkins/workspace/ne-learning_concord-mono_IS-2243/e2e/tests/step_defs/core.270
2021-05-06 16:32:36.042 #
2021-05-06 16:32:36.042 # An error report file with more information is saved as:
2021-05-06 16:32:36.042 # /home/jenkins/workspace/ne-learning_concord-mono_IS-2243/e2e/tests/step_defs/hs_err_pid270.log
2021-05-06 16:32:36.042 #
2021-05-06 16:32:36.042 # If you would like to submit a bug report, please visit:
2021-05-06 16:32:36.042 # https://bugzilla.redhat.com/enter_bug.cgi?product=Red%20Hat%20Enterprise%20Linux%208&component=java-11-openjdk
2021-05-06 16:32:36.042 #
2021-05-06 16:32:36.042 Fatal Python error: Aborted
2021-05-06 16:32:36.042
2021-05-06 16:32:36.042 Thread 0x00007fee56e3a100 (most recent call first):
2021-05-06 16:32:36.042 File "/opt/app-root/lib64/python3.8/site-packages/jpype/_core.py", line 340 in _JTerminate
2021-05-06 16:32:36.042 /home/jenkins/workspace/ne-learning_concord-mono_IS-2243@tmp/durable-0824d76d/script.sh: line 54: 270 Aborted
Could it be related to what you wrote here in your documentation?
https://jpype.readthedocs.io/en/latest/userguide.html#errors-reported-by-python-fault-handler
It seems like using the -p no:faulthandler switch on pytest might help avoid these errors.
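Outside pytest, roughly the same effect can be had with the standard library's `faulthandler` module. This is a sketch under the assumption that the reports really do come from Python's fault handler, as the linked section suggests; `disable_fault_handler` is a hypothetical helper name, and whether this avoids the crash in this issue is untested.

```python
import faulthandler


def disable_fault_handler():
    """Turn off Python's fault handler if it is on.

    Returns True if the handler was enabled before the call, so the
    caller can log whether anything actually changed.
    """
    was_enabled = faulthandler.is_enabled()
    if was_enabled:
        faulthandler.disable()
    return was_enabled
```

The idea would be to call this before the JVM is started, so that signals seen at startup or shutdown are not reported as Python faults.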
As per #720, to get 1.0.2 working for us, I moved JPype initialisation so that it is delayed until actually needed. On process exit, it seems that if the initialisation code is not actually called, the process exits with a SIGSEGV:
And here is line 321:
The only reference to jpype that this program could have made is an import as part of its transitive fanout:
and it won't have invoked startJVM().