Open AdamBrousseau opened 4 years ago
I expect the following call is failing https://github.com/eclipse/omr/blob/a960b4f24a3c9372d654579109a6b32954e182a2/port/win32/omrsysinfo.c#L785-L789
Does the machine have any swap space configured?
If I'm looking in the right spot, it would appear so.
Internal machine #2
(Win2016)
MyPC->Properties->Advanced system settings->Advanced (tab)->(Performance) Settings->Advanced (tab)->Change
vs external machine #3
(Win2012R2)
Reran internally on a 2012r2 machine and it passed. So it seems to be an issue with win2016. Adopt has both 2012 and 2016 but I don't believe they regularly run the functional suite. I also don't have permissions to run grinders there.
Do we have win2016 in the farm? We could try the test there and double check it's working on farm machines.
i.e. job_output.php?id=620517467
Seems to work fine on 10, 2012, and 2016 on the legacy farm. build_id=432218
Fails at Adopt on 2012 for the same error https://ci.adoptopenjdk.net/job/Grinder/1039/ But passes at Adopt on 2016 https://ci.adoptopenjdk.net/job/Grinder/1040/ So it seems to be more of a machine config issue rather than a Windows version issue.
import java.lang.management.*;
public class Test {
public static void main(String[] args) throws Throwable {
com.ibm.lang.management.OperatingSystemMXBean osbean = (com.ibm.lang.management.OperatingSystemMXBean)ManagementFactory.getOperatingSystemMXBean();
System.out.println(osbean.getTotalSwapSpaceSize());
}
}
bin/java -Xtrace:print=omrport.585-590 Test
19:21:27.253*0x22158700 omrport.585 > omrsysinfo_get_memory_info: Function entered.
19:21:27.253 0x22158700 omrport.589 * - omrsysinfo_get_memory_info: Error: Failed collecting data.
19:21:27.253 0x22158700 omrport.590 < omrsysinfo_get_memory_info: Return code = -358.
19:21:27.409*0x22158700 omrport.585 > omrsysinfo_get_memory_info: Function entered.
19:21:27.409 0x22158700 omrport.589 * - omrsysinfo_get_memory_info: Error: Failed collecting data.
19:21:27.409 0x22158700 omrport.590 < omrsysinfo_get_memory_info: Return code = -358.
omrport.589 1 1 1 N Trc_PRT_sysinfo_get_memory_info_dataQueryFailed "* omrsysinfo_get_memory_info: Error: Failed collecting data."
https://github.com/eclipse/omr/blob/a960b4f24a3c9372d654579109a6b32954e182a2/port/win32/omrsysinfo.c#L777-L782
The error code -358 comes from the OMR port library, so isn't helpful. Looking at https://docs.microsoft.com/en-us/windows/win32/api/pdh/nf-pdh-pdhcollectquerydata there are two possible return codes, PDH_INVALID_HANDLE and PDH_NO_DATA. I'm assuming it's not PDH_INVALID_HANDLE because the code works on other machines. PDH_NO_DATA shows: "The query may not contain data because the user is not running with an elevated token (see Limited User Access Support). "
Add user to Performance Log Users
group.
Should this be documented in the OpenJ9 functional test requirements? cc @smlambert
Seems like it is something to be captured in ansible playbooks for x32 win machines and if I understand it correctly it would be a problem for any user who calls getTotalSwapSpaceSize() on a x32 windows 2012 machine?
Sorry, was getting this crossed with #7717 ignore my comment.
Copied from https://github.com/eclipse/openj9/issues/7717#issuecomment-558718727
Root cause:
Newer version of OpenSSH (7.9->8.0 see changelog [1], Portability section), which is installed during machine & Cygwin setup, has changes to the setup which further restrict the configuration and resulting default permissions. Specifically, the process now runs as SYSTEM
user instead of an added cyg_server
user (see now outdated setup instructions in [2]). Which has the effect of cascading less permissions down to the jenkins
user when logged in via cygwin/ssh. Jenkins no longer has (by default) access to run WMI/WMIC or query the system for swap space (memory info). The "fix" is to give more permissions to the jenkins user.
To resolve this issue (#7716)
getTotalSwapSpaceSize() API returned -1
Performance Log Users
group.
See https://github.com/AdoptOpenJDK/openjdk-infrastructure/pull/1009To resolve #7717
getPID() returned Access Deined
The following settings are required in order to allow jenkins remote access to WMI:
Setting DCOM Security to Allow a User to Access a Computer Remotely
Component Services
or dcomcnfg
Setting Namespace Security with the WMI Control
* sshd(8): On Cygwin, run as SYSTEM where possible, using S4U for
token creation if it supports MsV1_0 S4U Logon.
My follow up question, is this something that should be documented at OMR since OMR testing will fail if these settings aren't applied? Also, if a thirdparty app tries to use the omrsysinfo_get_memory_info function on win32 it will fail if they are running via Cygwin-ssh without adding these extra perms.
This is also occurring at AdoptOpenJDK, but on win64_xl (the test is skipped on win64 because it specifies -Xnocompressedrefs
) - e.g. https://ci.adoptopenjdk.net/job/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/63/. It does not fail on all test machines - see https://github.com/AdoptOpenJDK/openjdk-infrastructure/issues/1805
17:55:47 openjdk version "1.8.0_282-internal"
17:55:47 OpenJDK Runtime Environment (build 1.8.0_282-internal-202101051709-b07)
17:55:47 Eclipse OpenJ9 VM (build master-1910cfa3a, JRE 1.8.0 Windows Server 2012 R2 amd64-64-Bit 20210105_550 (JIT enabled, AOT enabled)
17:55:47 OpenJ9 - 1910cfa3a
17:55:47 OMR - a9b64bdc8
17:55:47 JCL - 722ab284 based on jdk8u282-b07)
19:46:46 ===============================================
19:46:46 Running test testOSMXBeanLocal_0 ...
19:46:46 ===============================================
19:46:46 testOSMXBeanLocal_0 Start Time: Tue Jan 5 19:46:41 2021 Epoch Time (ms): 1609876001610
19:46:46 "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdkbinary/j2sdk-image\\bin\\java" -Xshareclasses:destroyAll; "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdkbinary/j2sdk-image\\bin\\java" -Xshareclasses:groupAccess,destroyAll; echo "cache cleanup done";
19:46:46 JVMSHRC005I No shared class caches available
19:46:46 JVMSHRC005I No shared class caches available
19:46:46 cache cleanup done
19:46:46 variation: Mode100
19:46:46 JVM_OPTIONS: -Xint -Xgcpolicy:optthruput -Xnocompressedrefs
19:46:46 { itercnt=1; \
19:46:46 mkdir -p "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/\\TKG\\test_output_16098696244505\\testOSMXBeanLocal_0"; \
19:46:46 cd "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/\\TKG\\test_output_16098696244505\\testOSMXBeanLocal_0"; \
19:46:46 export PATH="/cygdrive/c/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdkbinary/openjdk-test-image/openj9:/cygdrive/c/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdkbinary/j2sdk-image/jre/bin/default:/cygdrive/c/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdkbinary/j2sdk-image/jre/bin/j9vm:/cygdrive/c/Program Files (x86)/Common Files/Oracle/Java/javapath:/usr/bin:/cygdrive/c/Windows/system32:/cygdrive/c/Windows:/cygdrive/c/Windows/System32/Wbem:/cygdrive/c/Windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/Strawberry/perl/bin:/cygdrive/c/Program Files/Git/cmd:/cygdrive/c/Program Files/Java/jdk8u172-b11/bin:/cygdrive/c/apache-ant/apache-ant-1.10.3/bin:/cygdrive/c/Program Files (x86)/Windows Kits/8.1/Windows Performance Toolkit:/cygdrive/c/Program Files/Microsoft SQL Server/110/Tools/Binn:/cygdrive/c/Program Files (x86)/Microsoft SDKs/TypeScript/1.0:/cygdrive/c/Program Files/Microsoft SQL Server/120/Tools/Binn"; \
19:46:46 "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdkbinary/j2sdk-image\\bin\\java" -Xint -Xgcpolicy:optthruput -Xnocompressedrefs \
19:46:46 -DRUNLOCAL=true \
19:46:46 -cp "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests///..//jvmtest\\TestConfig\\resources;C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/\\TKG\\lib\\testng.jar;C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/\\TKG\\lib\\jcommander.jar;C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests///..//jvmtest\\functional\\JLM_Tests\\jlm_tests.jar" \
19:46:46 org.testng.TestNG -d "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/\\TKG\\test_output_16098696244505\\testOSMXBeanLocal_0" "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests///..//jvmtest\\functional\\JLM_Tests\\testng.xml" \
19:46:46 -testnames testOSMXBean \
19:46:46 -groups level.extended \
19:46:46 -excludegroups d.*.win_x86-64,d.*.arch.x86,d.*.os.win,d.*.bits.64,d.*.generic-all; \
19:46:46 if [ $? -eq 0 ] ; then echo ""; echo "testOSMXBeanLocal_0""_PASSED"; echo ""; cd C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/; else echo ""; echo "testOSMXBeanLocal_0""_FAILED"; echo ""; fi; } 2>&1 | tee -a "C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/\\TKG\\test_output_16098696244505\\TestTargetResult";
19:46:46 [IncludeExcludeTestAnnotationTransformer] [INFO] exclude file is C:/Users/jenkins/workspace/Test_openjdk8_j9_extended.functional_x86-64_windows_xl/openjdk-tests/\\TestConfig\\resources\\excludes\\latest_exclude_8.txt
19:46:46 ...
19:46:46 ... TestNG 6.14.2 by Cédric Beust (cedric@beust.com)
19:46:46 ...
19:46:46
19:46:46 [TestOperatingSystemMXBean] [INFO] ----------------------------------------------------------
19:46:46 [TestOperatingSystemMXBean] [INFO] Starting the OperatingSystemMXBean usage API tests........
19:46:46 [TestOperatingSystemMXBean] [INFO] ----------------------------------------------------------
19:46:46 [TestOperatingSystemMXBean] [INFO] Tests starting at: 1609876002323 ms.
19:46:46 [TestOperatingSystemMXBean] [INFO] Testing getTotalPhysicalMemory() & getFreePhysicalMemorySize() APIs
19:46:46 [TestOperatingSystemMXBean] [INFO] Testing getTotalSwapSpaceSize() & getFreeSwapSpaceSize() APIs
19:46:46 [TestOperatingSystemMXBean] [INFO] Tests finished at: 1609876002328 ms.
19:46:46 FAILED: runTestOSMXBean
19:46:46 java.lang.AssertionError: Error: getTotalSwapSpaceSize() API returned -1, test failed
19:46:46 at org.testng.Assert.fail(Assert.java:96)
19:46:46 at org.openj9.test.management.TestOperatingSystemMXBean.test_SwapSpaceSizeAPIs(TestOperatingSystemMXBean.java:227)
19:46:46 at org.openj9.test.management.TestOperatingSystemMXBean.runTestOSMXBean(TestOperatingSystemMXBean.java:171)
19:46:46 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
19:46:46 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
19:46:46 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
19:46:46 at java.lang.reflect.Method.invoke(Method.java:498)
19:46:46 at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124)
19:46:46 at org.testng.internal.Invoker.invokeMethod(Invoker.java:580)
19:46:46 at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:716)
19:46:46 at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:988)
19:46:46 at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
19:46:46 at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
19:46:46 at org.testng.TestRunner.privateRun(TestRunner.java:648)
19:46:46 at org.testng.TestRunner.run(TestRunner.java:505)
19:46:46 at org.testng.SuiteRunner.runTest(SuiteRunner.java:455)
19:46:46 at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:450)
19:46:46 at org.testng.SuiteRunner.privateRun(SuiteRunner.java:415)
19:46:46 at org.testng.SuiteRunner.run(SuiteRunner.java:364)
19:46:46 at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
19:46:46 at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:84)
19:46:46 at org.testng.TestNG.runSuitesSequentially(TestNG.java:1208)
19:46:46 at org.testng.TestNG.runSuitesLocally(TestNG.java:1137)
19:46:46 at org.testng.TestNG.runSuites(TestNG.java:1049)
19:46:46 at org.testng.TestNG.run(TestNG.java:1017)
19:46:46 at org.testng.TestNG.privateMain(TestNG.java:1354)
19:46:46 at org.testng.TestNG.main(TestNG.java:1323)
19:46:46
19:46:46
19:46:46 ===============================================
19:46:46 testOSMXBean
19:46:46 Tests run: 1, Failures: 1, Skips: 0
19:46:46 ===============================================
@lumpfish see https://github.com/eclipse/openj9/issues/7716#issuecomment-558719129 which describes how to fix the machine. i.e. To resolve this issue (#7716) getTotalSwapSpaceSize() API returned -1
Add jenkins user to Performance Log Users group. See AdoptOpenJDK/openjdk-infrastructure#1009
Failing on the internal nightly jdk8 win32 build. Seems to be fine on win64 across versions. 4 tests
All failing with he same error. Doesn't seem to be machine specific. 100% failure rate.
Seeing as this passes on the external build I would assume it is something on machine config side. I will just need someone to tell help me figure out what to change.