adoptium / aqa-tests

Home of test infrastructure for Adoptium builds
https://adoptium.net/aqavit
Apache License 2.0

trapme: alternate aqa-tests branches cause failure except on Grinder jobs #5439

Open sxa opened 3 months ago

sxa commented 3 months ago

Running using a custom ADOPTOPENJDK_REPO and ADOPTOPENJDK_BRANCH results in failures on the main test jobs, although not on Grinders. This used to work but now gives a rather cryptic error message to the user:

hudson.plugins.git.GitException: Command "git fetch --tags --force --progress --prune -- origin +refs/heads/refine_concurrency_logic:refs/remotes/origin/refine_concurrency_logic" returned status code 128:
stdout: 
stderr: fatal: couldn't find remote ref refs/heads/refine_concurrency_logic

This likely started when https://github.com/adoptium/aqa-tests/pull/5204 was merged. I and other Temurin committers (slack thread 1, slack thread 2) have tripped over this, so if we plan to keep this behavior and leave the parameter on the jobs, we should give the user some sort of warning in these circumstances, with a pointer as to why it has happened and how to resolve it.

sophia-guo commented 1 month ago

This is not related to #5204. It is expected due to https://issues.jenkins.io/browse/JENKINS-42971 (note that it shows as resolved but actually is not). With LIGHT_WEIGHT_CHECKOUT=true, Pipeline script from SCM does not expand the Repository URL.

All the non-Grinder jobs (nightly or release test jobs) are generated with LIGHT_WEIGHT_CHECKOUT=true and the Repository URL set explicitly. Those non-Grinder jobs are not supposed to be run with a custom ADOPTOPENJDK_REPO. To run with a custom ADOPTOPENJDK_REPO, use a Grinder job instead.
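A minimal sketch of the behaviour described above, not the actual Jenkins or aqa-tests code. It assumes the branch is still taken from ADOPTOPENJDK_BRANCH (as the `git fetch` error suggests) while the Repository URL is only expanded when lightweight checkout is off. `JobConfig`, `resolve_checkout`, `repo_warning`, `DEFAULT_REPO` and the `someuser` fork URL are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the Repository URL set explicitly in generated jobs.
DEFAULT_REPO = "https://github.com/adoptium/aqa-tests.git"


@dataclass
class JobConfig:
    name: str
    lightweight_checkout: bool   # LIGHT_WEIGHT_CHECKOUT
    configured_repo_url: str     # literal URL, or "${ADOPTOPENJDK_REPO}"


def resolve_checkout(job: JobConfig, params: Dict[str, str]) -> Tuple[str, str]:
    """Return the (repo, branch) pair this model would fetch from."""
    branch = params.get("ADOPTOPENJDK_BRANCH", "master")
    repo = job.configured_repo_url
    if not job.lightweight_checkout:
        # Assumption: only a full checkout expands parameters in the URL.
        custom = params.get("ADOPTOPENJDK_REPO", DEFAULT_REPO)
        repo = repo.replace("${ADOPTOPENJDK_REPO}", custom)
    return repo, branch


def repo_warning(job: JobConfig, params: Dict[str, str]) -> Optional[str]:
    """Return a hint when the requested repo is silently ignored, else None."""
    requested = params.get("ADOPTOPENJDK_REPO")
    repo, branch = resolve_checkout(job, params)
    if requested and requested != repo:
        return (
            f"{job.name}: ADOPTOPENJDK_REPO={requested} is ignored; "
            f"branch '{branch}' will be fetched from {repo}. "
            "Use a Grinder job to test a custom repo."
        )
    return None


if __name__ == "__main__":
    params = {
        "ADOPTOPENJDK_REPO": "https://github.com/someuser/aqa-tests.git",
        "ADOPTOPENJDK_BRANCH": "refine_concurrency_logic",
    }
    official = JobConfig("official_test_job", True, DEFAULT_REPO)
    print(repo_warning(official, params))
```

In this model the custom branch is requested from the default repository, where it does not exist, which is consistent with the "couldn't find remote ref" failure reported at the top of this issue.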

sxa commented 1 month ago

> This is not related to #5204. It is expected due to https://issues.jenkins.io/browse/JENKINS-42971 (note that it shows as resolved but actually is not). With LIGHT_WEIGHT_CHECKOUT=true, Pipeline script from SCM does not expand the Repository URL.

OK thanks - a quick search on the repo made it look like 5204 was changing the lightweight checkout parameter and it seemed to be around the time the repository options stopped working for me which is why I thought that was where it was introduced.

> All the non-Grinder jobs (nightly or release test jobs) are generated with LIGHT_WEIGHT_CHECKOUT=true and the Repository URL set explicitly. Those non-Grinder jobs are not supposed to be run with a custom ADOPTOPENJDK_REPO. To run with a custom ADOPTOPENJDK_REPO, use a Grinder job instead.

In that case we should look at having one or both of the following (especially given that this used to work in the past):

adamfarley commented 1 month ago

FYI: I pitched a PR to add a helpful comment to the job description (re ADOPTOPENJDK_REPO immutability) a few months ago, but the community was against it, so the PR was closed without merging. https://github.com/adoptium/aqa-tests/pull/5135

sxa commented 1 month ago

@smlambert Would you consider re-opening @adamfarley's PR? I feel this would be a good enhancement that would have saved me (at least twice now, since I hit it again this week) time spent debugging a non-obvious failure with the parameter. It strikes me as quite a desirable change to the parameter documentation.

smlambert commented 1 month ago

The wording will have to change AND can we agree that it is not best practice to rerun an official test job and change these parameters (which is why Grinder jobs and convenience links exist)? And this is all due to a Jenkins bug, which when fixed will nullify the need to have changed the 'tip' in the job config.

smlambert commented 1 month ago

Note that our Test_Job_Auto_Gen did not have the LIGHT_WEIGHT_CHECKOUT parameter, so newly generated child jobs did not get that parameter percolated down to them. It has now been added, so child jobs from Grinders or AQA_Test_Pipeline now behave as expected on ci.adoptium.net (FYI @sophia-guo).

sxa commented 1 month ago

> The wording will have to change AND can we agree that it is not best practice to rerun an official test job and change these parameters

Thanks - I'm ok with that as long as it's clear from the field descriptions what won't work as an end-user might expect it to :-) It's partly the fact that it's changed (since I believe it used to work ok until a few months back or thereabouts) which I think is causing particular confusion here, although if we're talking about the aforementioned https://issues.jenkins.io/browse/JENKINS-42971, that dates from 2017, so I guess one of our PRs must have done something which has caused it to be exposed more recently.

> And this is all due to a Jenkins bug, which when fixed will nullify the need to have changed the 'tip' in the job config.

I also note that Sophia mentioned earlier that the upstream bug "shows resolved but actually not" - perhaps we should add a note to that effect to the issue or raise another if it has not solved the issue for our case?

sophia-guo commented 1 month ago

https://github.com/adoptium/aqa-tests/issues/5439#issuecomment-2303227575

@smlambert thanks! Though I noticed that all jobs newly generated by Test_Job_Auto_Gen (triggered by the upstream build pipeline, not by Grinders or AQA_Test_Pipeline) are configured with LIGHT_WEIGHT_CHECKOUT=false. https://ci.adoptium.net/view/Test_grinder/job/Test_Job_Auto_Gen/

https://ci.adoptium.net/view/Test_grinder/job/Test_openjdk11_hs_dev.functional_ppc64_aix/jobConfigHistory/showDiffFiles?timestamp1=2024-08-15_05-13-10&timestamp2=2024-08-22_05-10-41

sophia-guo commented 1 month ago

Test_Job_Auto_Gen now has the LIGHT_WEIGHT_CHECKOUT parameter, and its default value is no longer true: https://github.com/adoptium/aqa-tests/blob/master/buildenv/jenkins/testJobTemplate#L55

If we need the official jobs to stay with LIGHT_WEIGHT_CHECKOUT=true, this PR is needed: https://github.com/adoptium/ci-jenkins-pipelines/pull/1102

sophia-guo commented 1 month ago

If AQA_Test_Pipeline is triggered with a personal repo and branch and LIGHT_WEIGHT_CHECKOUT=false, all those parameters will be passed down to the child jobs with LIGHT_WEIGHT_CHECKOUT=false (those child jobs are the same jobs that are generated or triggered by the upstream build jobs).

A releaseType == 'Weekly' job with aqaAutoGen enabled can regenerate the jobs and override them with LIGHT_WEIGHT_CHECKOUT=true.

However, release jobs are not normally triggered with aqaAutoGen enabled, so if AQA_Test_Pipeline is triggered with a personal repo and branch and LIGHT_WEIGHT_CHECKOUT=false between the weekly and the release, the release will use test jobs with the same configuration as the ones generated by AQA_Test_Pipeline.

This may never happen, but it is possible. If we later move to triggering test jobs only through AQA_Test_Pipeline, it will not be an issue at all.
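The ordering problem above can be sketched as a small state model. This is only an illustration under stated assumptions: it assumes that a weekly run with aqaAutoGen regenerates the child job with LIGHT_WEIGHT_CHECKOUT=true, that AQA_Test_Pipeline regenerates it with whatever value was passed in, and that a release run reuses the existing configuration. `ChildJobConfig` and the three functions are hypothetical names, not part of aqa-tests or ci-jenkins-pipelines.

```python
from dataclasses import dataclass


@dataclass
class ChildJobConfig:
    lightweight_checkout: bool   # LIGHT_WEIGHT_CHECKOUT stored in the job config
    generated_by: str            # which trigger last (re)generated the job


def weekly_build(cfg: ChildJobConfig, aqa_auto_gen: bool) -> ChildJobConfig:
    """Weekly run: regenerates the child job only when aqaAutoGen is enabled."""
    if aqa_auto_gen:
        return ChildJobConfig(lightweight_checkout=True, generated_by="weekly")
    return cfg


def aqa_test_pipeline(lightweight_checkout: bool) -> ChildJobConfig:
    """AQA_Test_Pipeline run: its parameter value ends up in the child job."""
    return ChildJobConfig(lightweight_checkout, generated_by="AQA_Test_Pipeline")


def release_build(cfg: ChildJobConfig) -> ChildJobConfig:
    """Release run without aqaAutoGen: reuses whatever configuration exists."""
    return cfg


if __name__ == "__main__":
    cfg = weekly_build(ChildJobConfig(True, "initial"), aqa_auto_gen=True)
    cfg = aqa_test_pipeline(lightweight_checkout=False)  # personal repo run
    cfg = release_build(cfg)
    print(cfg)
```

In this model the release run ends up with the configuration left behind by AQA_Test_Pipeline, and only the next weekly run with aqaAutoGen enabled resets it.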