Closed fjeremic closed 4 years ago
@smlambert @llxia FYI. I'm more than happy to help with any of the above.
For a particular test failure it is non-obvious which test bucket the test belongs to

It is in the link, i.e. https://ci.eclipse.org/openj9/job/Test-extended.system-JDK8-linux_390-64_cmprssptrs/181/ where `extended.system` indicates system testing.
but where do I download this build?
You need to look at the parent job(s) of the test failure and find the build job for the platform; the JVM is an artifact of that job. It is only available for a short time, depending on how many other builds are run, as we have limited space.
I will try to clarify some of the questions.
It is not clear from a failed test what JVM command line options were used. Example: https://ci.eclipse.org/openj9/job/Test-extended.system-JDK8-linux_390-64_cmprssptrs/181/
TKG does print this info at the beginning of each test, including the `JVM_OPTIONS`:

```
===============================================
Running test SharedClassesAPI_0 ...
===============================================
SharedClassesAPI_0 Start Time: Thu Jan 31 03:14:37 2019 Epoch Time (ms): 1548922477290
variation: NoOptions
JVM_OPTIONS: -Xcompressedrefs
```
As a result it is not clear how to add EXTRA_OPTIONS or JVM_OPTIONS to a Grinder

It is documented in:
https://github.com/AdoptOpenJDK/openjdk-tests/wiki/How-to-Run-a-Grinder-Build-on-Jenkins
https://github.com/eclipse/openj9/blob/master/test/docs/OpenJ9TestUserGuide.md
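As a concrete sketch (the test target and option below are examples, not taken from this issue), the same variables described in those docs can also be passed on the make command line when running locally:

```shell
# Hypothetical example: EXTRA_OPTIONS is appended on top of the options the
# playlist already sets, while JVM_OPTIONS replaces them entirely.
make _SharedClassesAPI_0 EXTRA_OPTIONS="-Xint"
```

In a Grinder, the same values go into the EXTRA_OPTIONS / JVM_OPTIONS build parameters on the Jenkins job page.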
For a particular test failure it is non-obvious which test bucket the test belongs to. Is it functional? Is it systemtest? Is it some Adopt test? Other than going to the ~4 different repos and searching for the test name, is there a better way to know? One needs to know this to fill in the BUILD_LIST in the Grinder.
The information is in the job name. For example, `Test-extended.functional-JDK11-linux_x86-64_cmprssptrs` means running the `extended` `functional` tests using `JDK11` on `linux_x86-64_cmprssptrs`. For system tests, you should see `system` in the job name.
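To make that decoding concrete, here is a small illustrative shell snippet (not part of any OpenJ9 tooling) that splits a job name of the form `Test-<level>.<group>-<jdk>-<platform>` into its parts:

```shell
#!/bin/sh
# Illustration only: split Test-<level>.<group>-<jdk>-<platform>.
job="Test-extended.functional-JDK11-linux_x86-64_cmprssptrs"

rest=${job#Test-}       # strip the "Test-" prefix
level=${rest%%.*}       # "extended"
rest=${rest#*.}
group=${rest%%-*}       # "functional"
rest=${rest#*-}
jdk=${rest%%-*}         # "JDK11"
platform=${rest#*-}     # "linux_x86-64_cmprssptrs"

echo "level=$level group=$group jdk=$jdk platform=$platform"
```

The `<group>` part (functional, system, openjdk, external, etc.) is what tells you which test repo/folder to look in.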
From a test failure, looking at the `java -version` output it is not clear where to download the JDK. For example, I know the SHA numbers and the build date and number too, "20190130_208 (JIT enabled, AOT enabled)", but where do I download this build?
In OpenJ9 Jenkins, we can get the parameters from the test build: https://ci.eclipse.org/openj9/view/Test/job/Test-extended.functional-JDK11-linux_x86-64_cmprssptrs/169/parameters/
It shows `UPSTREAM_JOB_NAME` and `UPSTREAM_JOB_NUMBER`. We should be able to find the build in the Jenkins Build tab. Once we find the exact JDK build, we can `Copy Link Address` of the archived JDK.
Or another way to do this: https://github.com/eclipse/openj9/issues/3697#issuecomment-439166452
I can see the `curl` command from the Grinder, so I can find the JDK from there, but why shouldn't we be able to extract the build ID from `java -version` somehow? If someone just pasted me the `java -version` output I would have no idea how to grab that same build, and that is a problem.
The Grinder can take a JDK from any public URL (i.e., AdoptOpenJDK, Artifactory, etc.). We may not have enough information to determine the OpenJ9 build ID.
STF tests don't show crash information in the console output. You always have to dig through the artifacts to find it, which is time consuming.
@Mesbah-Alam we may need to update STF to handle this.
Why do we need to export JDK_VERSION when running tests? Should we be able to determine that from $JAVA_BIN/java -version output?
We are working on this: https://github.com/eclipse/openj9/issues/442. The idea is that we will not need to provide JDK_VERSION, JDK_IMPL, and SPEC; all the information can be auto-detected when JAVA_BIN is provided.
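A rough sketch of the kind of auto-detection that makes JDK_VERSION redundant (the sample output string below is hardcoded for illustration; real JDK 8 strings use the `1.8.0` form and would need extra handling):

```shell
#!/bin/sh
# Sketch: derive the major version from `java -version`-style output.
# In practice the string would come from "$JAVA_BIN/java -version 2>&1".
sample='openjdk version "11.0.2" 2019-01-15'
major=$(printf '%s\n' "$sample" | sed -n 's/.*version "\([0-9][0-9]*\)[^"]*".*/\1/p')
echo "JDK_VERSION=$major"
```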
When attempting to reproduce [1] locally, the make compile command from the instructions in [2] seems to compile all tests (sometimes compiling ~6000 Java source files), however a Grinder launched for the same test seems to only compile and run the one specific test [3]. Why is that? How can I locally do the same thing as the Grinder? i.e. I only want to compile and run the one test I care about.
We can use `BUILD_LIST` to narrow down to the folder that we care about. This is documented in the FAQ. Maybe we should add a link to the FAQ in https://github.com/eclipse/openj9/wiki/Reproducing-Test-Failures-Locally
Note: this feature only works for subdirs in functional at the moment. Support for systemtest is on the way.
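For instance (the folder name and JDK path are examples; the FAQ has the authoritative steps), a local run narrowed to one folder might look like:

```shell
# Sketch: BUILD_LIST limits compilation to the named subdirectory,
# mirroring what a Grinder does for a single test.
export TEST_JDK_HOME=/path/to/jdk
export BUILD_LIST=functional/cmdLineTests
make -f run_configure.mk
make compile
```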
Tests seem to run in huge buckets per Jenkins job, as opposed to much smaller buckets per job. This makes re-running a test tedious and involves a lot of manual work, as opposed to the VMFarm-esque experience of clicking a "Re-run on Grinder" button and launching a reproduction batch.
The test job does not have all parameters defined in its config, so rebuild may not work. One thing on our to-do list is to auto-generate test jobs so that we can avoid this issue.
Grinder tests are sequential, which is very time consuming when it comes to reproducing intermittent issues (1/50 failures take several hours, as opposed to a few minutes, to reproduce). Sometimes a failure occurs in the middle of a Grinder, say the 5th job out of 50. Is there a way to "kill" the Grinder and just get the data for the failure at that point, without running through the other 45 iterations?
An issue has been created: https://github.com/AdoptOpenJDK/openjdk-tests/issues/836. Once parallel mode is enabled, 50 iterations means starting 50 separate jobs. We can kill any of them in the middle of the Grinder.
Getting machine access is non-trivial (impossible?), which makes reproducing issues that only appear to happen on farm machines very difficult.
Unfortunately, the test team does not have control over machine access. FYI @jdekonin
It is in the link, i.e. ci.eclipse.org/openj9/job/Test-extended.system-JDK8-linux_390-64_cmprssptrs/181 where `extended.system` indicates system testing.
Right, this is also obvious from the test name. So where do I find this test? Is "extended.system" == "systemtest"? That part is confusing, at least to me.
You need to look at the parent job(s) of the test failure and find the build job for the platform; the JVM is an artifact of that job. It is only available for a short time, depending on how many other builds are run, as we have limited space.
Using your example: https://ci.eclipse.org/openj9/job/Test-extended.system-JDK8-linux_390-64_cmprssptrs/181/
I navigate to "build number 850", then to "build number 383", then I seem to be at the top level for this nightly build: https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-All/383/
I fail to see how to navigate to the build artifact you describe. Can you describe the steps from here?
It is documented in: AdoptOpenJDK/openjdk-tests/wiki/How-to-Run-a-Grinder-Build-on-Jenkins and /test/docs/OpenJ9TestUserGuide.md@master
There are quirks. For example, it is non-obvious how to input the following command:

```
-Xjit:{java/lang/SomeClass.foo()I}(tracefull,log=foo.trace)
```

Through experimentation and help from others, it seems you have to double quote the full command and escape the quotes, so the actual thing you have to input is:

```
\"-Xjit:{java/lang/SomeClass.foo()I}(tracefull,log=foo.trace)\"
```
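For contrast, when running locally in an ordinary shell only one layer of quoting is needed; single quotes are enough to protect the braces and parentheses (the class/method name below is a placeholder):

```shell
#!/bin/sh
# Locally, single quotes stop the shell from interpreting {, }, ( and ).
opt='-Xjit:{java/lang/SomeClass.foo()I}(tracefull,log=foo.trace)'
printf '%s\n' "$opt"
```

The Grinder form presumably needs the extra escaped quotes because the parameter goes through additional rounds of expansion before reaching `java`.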
In OpenJ9 Jenkins, we can get the parameters from the test build: ci.eclipse.org/openj9/view/Test/job/Test-extended.functional-JDK11-linux_x86-64_cmprssptrs/169/parameters
It shows `UPSTREAM_JOB_NAME` and `UPSTREAM_JOB_NUMBER`. We should be able to find the build in the Jenkins Build tab. Once we find the exact JDK build, we can `Copy Link Address` of the archived JDK. Or another way to do this: #3697 (comment)
Neither of these seem to work for the test failure example at hand from #4526:
Navigating to the build artifact from a test failure would be good to know; however, my original question was whether there is a way to navigate to the build artifact using only the `java -version` output, which I can always find inside a test failure console log.
We can use `BUILD_LIST` to narrow down to the folder that we care about. This is documented in the FAQ. Maybe we should add a link to the FAQ in eclipse/openj9/wiki/Reproducing-Test-Failures-Locally. Note: this feature only works for subdirs in functional at the moment. Support for systemtest is on the way.
Ah I see, I think I encountered the systemtest limitation here then.
Thanks for all the answers!
I fail to see how to navigate to the build artifact you describe. Can you describe the steps from here?
You do not need to get to build number 383. The information is in the console output of build number 850.
Hopefully, this comment lists the steps clearly: https://github.com/eclipse/openj9/issues/3697#issuecomment-439166452
JDK build 1178 passed but does not have the JDK archived. I do see the `tar` command in the console:
https://ci.eclipse.org/openj9/view/Build/job/Build-JDK8-linux_390-64_cmprssptrs/1178/console
The next nightly build does have the JDK archived:
https://ci.eclipse.org/openj9/view/Build/job/Build-JDK8-linux_390-64_cmprssptrs/1184/
@AdamBrousseau Is there a limitation on how long the artifacts are kept?
You do not need to get to build number 383. The information is in the console output of build number 850. Hopefully, this comment lists the steps clearly: #3697 (comment)
Right but there is no archive link anywhere.
JDK build 1178 passed but does not have the JDK archived. I do see the `tar` command in the console: ci.eclipse.org/openj9/view/Build/job/Build-JDK8-linux_390-64_cmprssptrs/1178/console
Yeah, it says:

```
23:51:16 ARTIFACTORY server is not set saving artifacts on jenkins.
```

Not sure why it worked on the very next build. Does that mean I can't get a hold of the exact binary JDK package used in that build (without having to rebuild the entire JDK using the SHAs)?
We only have space to keep 10 artifacts per build at the moment.
We should be able to pass in the build number and have it appear in the `-version` output; I'm fairly certain there is a configure parameter to allow this. In the part below, I think we could change the `+0` to be the build number:

```
(build 11.0.2-internal+0-adhoc.jenkins.Build-JDK11-linuxx86-64cmprssptrs)
```
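If I understand the upstream OpenJDK build correctly, the relevant knob is one of configure's version-string options (the flag name below is from the OpenJDK build system, not verified against our Jenkins scripts):

```shell
# Sketch: wire the Jenkins BUILD_NUMBER into the version string so that
# "11.0.2-internal+0-..." would become "11.0.2-internal+<BUILD_NUMBER>-...".
bash configure --with-version-build="$BUILD_NUMBER"
```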
A lot of these items have now been addressed through several major updates and features added, including but not limited to:

- variation (from the playlist) and the JVM_OPTIONS used are printed at the start of each test run. Example console output:
  15:27:52 variation: NoOptions
  15:27:52 JVM_OPTIONS: -Xcompressedrefs
- a Re-run link for easier prepopulation of Grinder parameters
- AUTODETECT, so if you use a customized/upstream SDK_RESOURCE, you no longer need to tell TKG which JDK_VERSION/JDK_IMPL it is
- removed the "make -f run_configure.mk" step, to simplify test runs even further
- better docs to ensure developers know to utilize BUILD_LIST to control which directories get compiled
- a new logical target called _testList (to allow a custom list of test targets to be passed to TKG, and therefore to a Grinder/test job/workflow)
- renamed all directories in the openjdk-tests repo to match the test group names (system == system, external == external, etc.)
- simplified use of get.sh (you can now simply clone openjdk-tests, export TEST_JDK_HOME=/whereEverYouPutYourJDK, and then run get.sh with no arguments)
- centralization of test docs can be tracked via https://github.com/AdoptOpenJDK/openjdk-tests/issues/1558
- smart parallelization work can be tracked via https://github.com/AdoptOpenJDK/openjdk-tests/issues/1563
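The simplified get.sh flow, spelled out (the JDK path is a placeholder):

```shell
# Sketch of the simplified setup: clone, point at a JDK, fetch dependencies.
git clone https://github.com/AdoptOpenJDK/openjdk-tests.git
cd openjdk-tests
export TEST_JDK_HOME=/whereEverYouPutYourJDK
./get.sh
```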
Most other items have been addressed in the comments above. The suggested enhancement to STF output should be raised against the STF repo, though I do not believe it will get any priority (no resources to spare), and STF output is already too verbose (we would want to reduce noise before adding new 'content' to the output stream).
Given all of that, I believe we can/should close this issue, @fjeremic ?
Agreed. Many thanks to the test team, who invested resources into fixing most of these issues. I have certainly observed the improvements and am very grateful for the investment in this area. Thank you!
While attempting to launch Grinders and reproduce #4526 locally I kept notes of some of the quirks I encountered or issues I stumbled upon. Some of these have been detailed in the various documentation, others are unspecified. I hope this feedback can be used to improve our documentation and/or processes for debugging:
Pain Points:
From a test failure, looking at the `java -version` output it is not clear where to download the JDK. I can see the `curl` command from the Grinder so I can find the JDK from there, but why shouldn't we be able to extract the build ID from `java -version` somehow? If someone just pasted me the `java -version` output I would have no idea how to grab that same build, and that is a problem.

Why do we need to export JDK_VERSION when running tests? Should we be able to determine that from `$JAVA_BIN/java -version` output?

When attempting to reproduce [1] locally, the `make compile` command from the instructions in [2] seems to compile all tests (sometimes compiling ~6000 Java source files), however a Grinder launched for the same test seems to only compile and run the one specific test [3]. Why is that? How can I locally do the same thing as the Grinder? i.e. I only want to compile and run the one test I care about.

[1] https://github.com/eclipse/openj9/issues/4526
[2] https://github.com/eclipse/openj9/wiki/Reproducing-Test-Failures-Locally#run-sanity-system-tests-on-jdk10_x86-64_linux_openj9-sdk
[3] https://hyc-runtimes-jenkins.swg-devops.com/view/Test_grinder/job/Grinder/1363/consoleText
Issues:
The instructions have you run `./get.sh` without setting `JAVA_BIN` first, which seems to be step 6, so the instructions need to get updated.

System tests set `JAVA_HOME` to be `../../` from `JAVA_BIN`, so it means `JAVA_BIN` has to be the "jre/bin" directory, not the "bin" directory. This is non-obvious.

`make _sanity.system` does not work due to a class loader `net.adoptopenjdk.stf.runner.StfClassLoader` exception being thrown. Exporting both `JAVA_BIN` and `JAVA_HOME` does not seem to work on s390 Linux as implied by other people. The issue seems to be that system tests define `JAVA_HOME` themselves by exporting `$JAVA_BIN/../../`, however the instructions in [1] specify `JAVA_BIN` should be `/someLocation/bin`, which appears to be incorrect as the instructions state to download/unpack the SDK to `/someLocation`. System tests seem to expect `JAVA_BIN` to be `/someLocation/jre/bin`, not `/someLocation/bin`. After changing this and rerunning `make -f run_configure.mk` and `make compile`, things seem to work now.

Tried `make SharedClassesAPI`, however this results in errors. Should it be `make _SharedClassesAPI_0`?

General Feedback: