adoptium / aqa-systemtest

Java load testing and other full system application tests
Apache License 2.0

All JlmRemote* tests failed with same or similar issue: ` **FAILED** Process LT1 has ended unexpectedly` on aarch64_linux #459

Open sophia-guo opened 2 years ago

sophia-guo commented 2 years ago

All JlmRemote* tests failed with the same or a similar issue. It looks intermittent: https://ci.adoptopenjdk.net/job/Test_openjdk17_hs_sanity.system_aarch64_linux/78/console

The tests previously passed on the same container, test-docker-ubuntu1604-armv8l-2: https://trss.adoptium.net/output/test?id=613bbc4cc6182d021712771f

Reran 10x: https://ci.adoptopenjdk.net/view/work-in-progress/job/grinder_sandbox_new/321/. The job timed out, but the 4 iterations that finished all passed.

19:59:01 CL1 j> 2021/09/15 07:59:00.844 Attempting to connect
19:59:02 STF 07:59:01.344 - **FAILED** Process LT1 has ended unexpectedly

Client cannot connect to server.

19:59:01  STF 07:59:00.666 - +------ Step 3 - Wait for processes to complete
19:59:01  STF 07:59:00.666 - | Wait for processes to meet expectations
19:59:01  STF 07:59:00.666 - |   Processes: [LT1, CL1]
19:59:01  STF 07:59:00.666 - |
19:59:01  STF 07:59:00.666 - Monitoring processes: CL1 LT1
19:59:01  CL1 j> 2021/09/15 07:59:00.774 ServerURL=service:jmx:rmi:///jndi/rmi://localhost:1234/jmxrmi
19:59:01  CL1 j> 2021/09/15 07:59:00.844 Attempting to connect
19:59:02  STF 07:59:01.344 - **FAILED** Process LT1 has ended unexpectedly
19:59:02  STF 07:59:01.344 - Monitoring Report Summary:
19:59:02  STF 07:59:01.344 -   o Process CL1 is still running
19:59:02  STF 07:59:01.344 -   o Process LT1 has ended unexpectedly
19:59:02  STF 07:59:01.344 - Killing processes: CL1 LT1
19:59:02  STF 07:59:01.344 -   o Process CL1 pid 27940 stop()
19:59:11  STF 07:59:11.345 -   o Process CL1 pid 27940 terminate()
19:59:12  STF 07:59:12.346 -   o Process CL1 pid 27940 killed
19:59:12  STF 07:59:12.346 -   o Process LT1 pid 27939 is not running
19:59:12  **FAILED** at step 3 (Wait for processes to complete). Expected return value=0 Actual=1 at /home/jenkins/workspace/Test_openjdk17_hs_sanity.system_aarch64_linux/aqa-tests/TKG/../TKG/output_16316586434876/TestJlmRemoteClassNoAuth_0/20210915-075859-TestJlmRemoteClassNoAuth/execute.pl line 160| line 176
19:59:12  STF 07:59:12.350 - **FAILED** execute script failed. Expected return value=0 Actual=1
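
When reproducing this on the affected machine, it may help to check whether anything is listening on the RMI registry port that CL1 reports in its `ServerURL` line. The following is a minimal sketch, not part of STF or the test suite: the function names `parse_jmx_endpoint` and `port_is_listening` are hypothetical, and a successful TCP connection only shows that the port is open, not that a JMX handshake would succeed.

```python
import re
import socket

# Matches the host and port of the RMI registry inside a JMX service URL,
# e.g. service:jmx:rmi:///jndi/rmi://localhost:1234/jmxrmi
JMX_URL_RE = re.compile(r"service:jmx:rmi:///jndi/rmi://([^:/\s]+):(\d+)/jmxrmi")


def parse_jmx_endpoint(log_line):
    """Return (host, port) from a line containing a JMX service URL, or None."""
    match = JMX_URL_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Line copied from the CL1 output in the log above.
    line = ("CL1 j> 2021/09/15 07:59:00.774 "
            "ServerURL=service:jmx:rmi:///jndi/rmi://localhost:1234/jmxrmi")
    endpoint = parse_jmx_endpoint(line)
    if endpoint is not None:
        host, port = endpoint
        state = "listening" if port_is_listening(host, port) else "not listening"
        print(f"{host}:{port} is {state}")
```

Running this before the test starts could show whether another process already holds the port; running it while the test runs could show whether LT1 ever opened it.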

TestJlmRemoteClassAuth_0 => deep history 5/7 passed | possible issues
TestJlmRemoteClassAuth_1 => deep history 5/7 passed | possible issues
TestJlmRemoteClassNoAuth_0 => deep history 5/7 passed | possible issues
TestJlmRemoteClassNoAuth_1 => deep history 5/7 passed | possible issues
TestJlmRemoteMemoryAuth_0 => deep history 5/7 passed | possible issues
TestJlmRemoteMemoryAuth_1 => deep history 5/7 passed | possible issues
TestJlmRemoteMemoryNoAuth_0 => deep history 5/7 passed | possible issues
TestJlmRemoteMemoryNoAuth_1 => deep history 5/7 passed | possible issues
TestJlmRemoteNotifierProxyAuth_1 => deep history 5/7 passed | possible issues
TestJlmRemoteNotifierProxyAuth_0 => deep history 5/7 passed | possible issues
TestJlmRemoteThreadAuth_1 => deep history 6/7 passed | possible issues
TestJlmRemoteThreadAuth_0 => deep history 6/7 passed | possible issues
TestJlmRemoteThreadNoAuth_1 => deep history 6/7 passed | possible issues
TestJlmRemoteThreadNoAuth_0 => deep history 6/7 passed | possible issues
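
To compare failure counts across test groups, the deep history summary can be tallied with a short script. This is only a sketch that assumes the `name => deep history P/T passed` text format shown above; `tally_deep_history` and `failures_by_group` are hypothetical helper names, not part of TRSS.

```python
import re

# One entry looks like: "TestJlmRemoteThreadAuth_1 => deep history 6/7 passed"
ENTRY_RE = re.compile(r"(\w+) => deep history (\d+)/(\d+) passed")


def tally_deep_history(text):
    """Return {test_name: (passed, total)} for every entry found in text."""
    return {name: (int(p), int(t)) for name, p, t in ENTRY_RE.findall(text)}


def failures_by_group(results):
    """Sum failed runs per test group, ignoring the _0/_1 variant suffix."""
    groups = {}
    for name, (passed, total) in results.items():
        group = re.sub(r"_\d+$", "", name)
        groups[group] = groups.get(group, 0) + (total - passed)
    return groups
```

Pasting the summary text into `tally_deep_history` and passing the result to `failures_by_group` gives the number of failed runs per group, which makes it easier to see whether some groups fail less often than others.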

https://github.com/adoptium/aqa-tests/issues/2854

smlambert commented 8 months ago

In Deep History view, it appears to happen consistently on test-docker-centos8-armv8-1 but passes on other machines.

[Screenshot 2024-01-08 at 8:25:46 AM]

smlambert commented 8 months ago

@Haroon-Khel - not sure if test-docker-centos8-armv8-1 is listed as a 'problem machine', but this problem of JlmRemote* tests failing on it appears to be consistent and reproducible.

Related: https://github.com/adoptium/infrastructure/issues/2662

Haroon-Khel commented 8 months ago

I reran the tests with 5 iterations on test-docker-centos8-armv8-1. I am seeing mostly passes. Only one failure of TestJlmRemoteMemoryAuth_0 in one of the iterations. https://ci.adoptium.net/job/Grinder/8448/console