Open sxa opened 2 years ago
@sophia-guo Is there a way to trigger regeneration of some of the jobs that aren't run regularly e.g. the external ones, and the non-hotspot ones like j9? Those seem to be most of the ones in the test area that are remaining.
Do you mean the jobs that are not running (not triggered to run, or disabled) are still generating warning messages?
It seems that way, yes.
> @sophia-guo Is there a way to trigger regeneration of some of the jobs that aren't run regularly e.g. the external ones, and the non-hotspot ones like j9? Those seem to be most of the ones in the test area that are remaining.
Can someone confirm the above for me please? Is there an easy way to trigger this that doesn't involve running a full openjdkXX-pipeline with the aqaAutoGen flag? Can I just run https://ci.adoptium.net/job/Test_Job_Auto_Gen/build?delay=0sec with the appropriate flags for the other variants to clear them up? This seems like an easy fix if it's all that's required. Are there any potential side-effects from doing that?
I've kicked off these two runs:
This will cover most of the ones we have remaining from a test perspective today, although a few others are notable in the logs too ... @smlambert @sophia-guo are these still needed and if not can you delete the jobs from the jenkins server please?
`Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux_rerun` is in the list too, which surprises me since it is a relatively new job. (For reference, it's giving the error with `IS_PARALLEL`, `RELEASE_TAG`, `TKG_BRANCH`, `TKG_REPO` and `TKG_SHA`.)
I've also run a Grinder job with dynamic parallel 5, which should regenerate the GrindertestList[0-4] jobs which were also being flagged.
I don't have any permission to delete Jenkins jobs.
We also have the problem of TRSS getting confused when we delete Jenkins jobs, so we should programmatically regenerate them without deletion or else we end up with this type of problem: https://github.com/adoptium/aqa-test-tools/issues/860
> I don't have any permission to delete Jenkins jobs.
Which ones would you like deleting? I can action those if required (Shelley can too!)
> We also have the problem of TRSS getting confused when we delete Jenkins jobs, so we should programmatically regenerate them without deletion or else we end up with this type of problem: https://github.com/adoptium/aqa-test-tools/issues/860
I believe the problem in there was caused by deletion of jobs that were still in use which then restarted from a build number of 1. In these cases I'm asking if the jobs are still required, or if they can be removed without replacement, which I do not believe is related to the problem in 860, but if I'm incorrect please let me know. If there is a specific process for the regeneration other than what I've done today in the above comment please let me know :-)
The question here is whether these jobs are required at all i.e. whether they can be removed from jenkins and TRSS (I'm unclear if any of these ones are monitored by TRSS).
Getting on top of these now I think - there were over 5000 yesterday, but as of halfway through today we've only had 8. I'll do another summary over the weekend. We could do with regenerating the `-hotspot` build jobs I think (especially since we seem to have switched all of them off for now, which I would suggest wasn't ideal unless they're running outside the triggers).
@andrew-m-leonard is there a way to point the regen jobs at a configuration that includes the non-temurin ones?
There have also been instances of these on the pr-tester jobs which could probably do with a bit of maintenance (although they're not showing up in the log today so far)
I think the other variants need to be in the pipeline's targetConfiguration to be re-gen'd, so if you could point the regen at a fork with such a change, that could possibly work?
Yeah that's what I wanted to do but it wasn't clear how to make that work. Is there a way to point it at an alternate location or would I have to create a new branch, put the changes in, edit the generator job (always a risk!) to use the fork, and then run it ... Or is there a simpler way?
There is DEFAULTS_URL which can point externally (although the default in the description of that points to something at the old AdoptOpenJDK repo, and I don't think the equivalent file in the new repo is the one I want)
From today and yesterday (Note this may not be an exhaustive list since it's not necessarily the same each day but gives an idea. We're getting close now though!):
The test autogen jobs seem to have two parameters causing problems too:
2024-05-03 12:37:35.013+0000 [id=4318864] WARNING hudson.model.ParametersAction#filter: Skipped parameter LIGHT_WEIGHT_CHECKOUT as it is undefined on Test_Job_Auto_Gen (#5,090). Set -Dhudson.model.ParametersAction.keepUndefinedParameters=true to allow undefined parameters to be injected as environment variables or -Dhudson.model.ParametersAction.safeParameters=[comma-separated list] to whitelist specific parameter names, even though it represents a security breach or -Dhudson.model.ParametersAction.keepUndefinedParameters=false to no longer show this message.
2024-05-03 12:37:35.014+0000 [id=4318864] WARNING hudson.model.ParametersAction#filter: Skipped parameter TEST_JOB_NAME as it is undefined on Test_Job_Auto_Gen (#5,090). Set -Dhudson.model.ParametersAction.keepUndefinedParameters=true to allow undefined parameters to be injected as environment variables or -Dhudson.model.ParametersAction.safeParameters=[comma-separated list] to whitelist specific parameter names, even though it represe
@smlambert @andrew-m-leonard Is there a way we can regen the jobs from the earlier comment (plus the pr-tester ones which can show up problems)? I'm not clear on how to do this, so if you're able to trigger it and put the information in here that would be appreciated, so we can clear off the relatively small number of remaining ones that we seem to have.
@sxa these look like job instances, e.g. `build-scripts/jobs/jdk/jdk-mac-aarch64-hotspot` (#38). Can't we just delete those particular job numbers?
> @sxa these look like job instances, e.g. `build-scripts/jobs/jdk/jdk-mac-aarch64-hotspot` (#38). Can't we just delete those particular job numbers?
A fair comment - I hadn't actually spotted that it was on non-current ones. If you confirm you're ok with us just deleting them I'm happy to go through and do that as a background task.
That would likely just leave things like this on the PR tester jobs:
2024-05-02 08:36:36.633+0000 [id=4239618] WARNING hudson.model.ParametersAction#filter: Skipped parameter NODE_LABEL as it is undefined on build-scripts-pr-tester/build-test/jobs/jdk/jdk-alpine-linux-x64-temurin (#9). Set -Dhudson.model.ParametersAction.keepUndefinedParameters=true to allow undefined parameters to be injected as environment variables or -Dhudson.model.ParametersAction.safeParameters=[comma-separated list] to whitelist specific parameter names, even though it represents a security breach or -Dhudson.model.ParametersAction.keepUndefinedParameters=false to no longer show this message.
Test jobs (28) J9(16)/Dragonwell(9)/Bisheng(3)
From a comment from Shelley in a meeting earlier today, it seems likely that running https://ci.eclipse.org/temurin-compliance/job/AQA_Test_Pipeline with `VARIANT` set appropriately and `AUTO_AQA_GEN` checked would regenerate these jobs. (Although I'm not sure if this will regen the `_quarkus_openshift`, `_xl` and `_rerun` jobs, so it would be good to have a way of doing that.)
Similarly, I'm unclear if we have a way to regen the various Grinder variants (although almost all of those in the twisty earlier were `grinder_sandbox_iteration_2`).
Runs - noting that the AQA_Test_pipeline job does not have an option for `SDK_RESOURCE=upstream`, so I'm having to locate a suitable `customized` URL for these (although worst case they just fail, but the regen will still have occurred by that point!)
This will cover most of the outstanding test jobs other than Grinders, `_rerun` and the specialist external ones.
Noting also that we're getting a couple of the parameters giving warnings on `Test_Job_Auto_Gen` as called from `AQA_Test_Pipeline`, which will likely require some sort of manual remediation:
2024-05-07 16:43:03.074+0000 [id=85459] WARNING hudson.model.ParametersAction#filter: Skipped parameter LIGHT_WEIGHT_CHECKOUT as it is undefined on Test_Job_Auto_Gen (#5,239)...
2024-05-07 16:43:03.076+0000 [id=85459] WARNING hudson.model.ParametersAction#filter: Skipped parameter TEST_JOB_NAME as it is undefined on Test_Job_Auto_Gen (#5,239)...
(I've had 3100 of those lines from Test_Job_Auto_Gen in the last two hours since I kicked off those jobs in the previous comment)
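For keeping track of counts like this, a rough sketch of how the warnings could be tallied per job and parameter from a saved copy of the log. The regex is written against the warning lines quoted in this issue only; the function name and the idea of feeding it a list of lines are assumptions for illustration, not an existing tool.

```python
import re
from collections import Counter

# Matches the "Skipped parameter" warnings quoted in this issue.
# The job name may contain slashes; the build number is optional and
# may contain commas (e.g. "#5,239").
WARNING_RE = re.compile(
    r"Skipped parameter\s+(?P<param>\S+)\s+as it is undefined on\s+"
    r"(?P<job>\S+)(?:\s*\(#(?P<build>[\d,]+)\))?"
)


def tally_skipped_parameters(lines):
    """Count warnings per (job, parameter) pair."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            # Strip a trailing full stop in case the job name ends the sentence.
            job = match.group("job").rstrip(".")
            counts[(job, match.group("param"))] += 1
    return counts


sample = [
    "2024-05-07 16:43:03.074+0000 [id=85459] WARNING hudson.model.ParametersAction#filter: "
    "Skipped parameter LIGHT_WEIGHT_CHECKOUT as it is undefined on Test_Job_Auto_Gen (#5,239)...",
    "2024-05-07 16:43:03.076+0000 [id=85459] WARNING hudson.model.ParametersAction#filter: "
    "Skipped parameter TEST_JOB_NAME as it is undefined on Test_Job_Auto_Gen (#5,239)...",
    "2024-05-02 08:36:36.633+0000 [id=4239618] WARNING hudson.model.ParametersAction#filter: "
    "Skipped parameter NODE_LABEL as it is undefined on "
    "build-scripts-pr-tester/build-test/jobs/jdk/jdk-alpine-linux-x64-temurin (#9). Set ...",
]

counts = tally_skipped_parameters(sample)
for (job, param), n in sorted(counts.items()):
    print(f"{n:5d}  {job}  {param}")
```

Grouping by job as well as parameter should make it easier to see which jobs still need a regen and which are one-off old build instances.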
FYI, TEST_JOB_NAME only applies to AQA_Test_Pipeline; there is no parameter of that name used in the generated jobs (nor do we want there to be).
> Similarly, I'm unclear if we have a way to regen the various Grinder variants (although almost all of those in the twisty earlier were `grinder_sandbox_iteration_2`).
I have been slowly removing grinder variants especially if they were last run 1 yr or more ago
> FYI, TEST_JOB_NAME only applies to AQA_Test_Pipeline; there is no parameter of that name used in the generated jobs (nor do we want there to be).
Makes sense, although the warning would suggest that `AQA_Test_pipeline` is trying to pass that parameter down to `Test_Job_Auto_Gen` somewhere.
> Makes sense, although the warning would suggest that `AQA_Test_pipeline` is trying to pass that parameter down to `Test_Job_Auto_Gen` somewhere.
The code cycles through all parameters and passes them down (with some exceptions at https://github.com/adoptium/aqa-tests/blob/master/buildenv/jenkins/aqaTestPipeline.groovy#L110-L114). TEST_JOB_NAME can be added to the set of exceptions that are not added to `childParams`, so that when the `downstreamJob` is created (at https://github.com/adoptium/aqa-tests/blob/master/buildenv/jenkins/aqaTestPipeline.groovy#L147), `childParams` does not contain it.
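To illustrate the idea (not the actual code, which is Groovy in aqaTestPipeline.groovy), here is a minimal Python sketch of "pass every parameter down except a set of exceptions". `EXCLUDED_PARAMS`, `build_child_params` and the sample values are hypothetical names chosen for this example.

```python
# Illustration only: the real logic is Groovy in aqaTestPipeline.groovy.
# EXCLUDED_PARAMS and build_child_params are hypothetical names.
EXCLUDED_PARAMS = {"TEST_JOB_NAME"}


def build_child_params(parent_params, excluded=EXCLUDED_PARAMS):
    """Copy every parent parameter except those in the exclusion set."""
    return {
        name: value
        for name, value in parent_params.items()
        if name not in excluded
    }


# Sample parent parameters (values are made up for the example).
parent = {
    "TEST_JOB_NAME": "example_job_name",
    "VARIANT": "hotspot",
    "AUTO_AQA_GEN": True,
}
child_params = build_child_params(parent)
print(sorted(child_params))
```

With the parameter filtered out before the downstream job is triggered, the callee never receives an undefined parameter, so the warning should not be logged in the first place.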
Deep dive into one of them (Corretto JDK8 sanity/system) for reference. There are only 13 entries relating to test skipped parameters today so far, and 8 of them are from different parameters on this job so yesterday's regens appear to have made quite a big difference.
JDK_REPO
JDK_BRANCH
PARALLEL
NUM_MACHINES
USE_TESTENV_PROPERTIES
GENERATE_JOBS
ACTIVE_NODE_TIMEOUT
and DYNAMIC_COMPILE
`config.xml` for the job is here in case it's useful for future reference (it's quite big since it has references to the adoptopenjdk repository - the last run of that job was April 14th 2022):
2774.Test_openjdk8_corretto_sanity.system_x86-64_linux.diff.txt
> I have been slowly removing grinder variants especially if they were last run 1 yr or more ago
@smlambert Can "suffixed" jobs like https://ci.adoptium.net/job/Test_openjdk11_hs_sanity.external_s390x_linux_system-test/ be removed too? The ones I've looked at today seem to be ones that haven't been run in nearly 3 years so I guess they were likely generated as some sort of experiment and are no longer required.
I don't want to be deleting them myself though - would rather someone from the test side with knowledge of the particular jobs that should be in scope was able to handle it.
I have a script that will look for jobs that have not been run in XX number of months, and optionally delete them if 'deleteJobs' parameter equals true. I think we should run such a job occasionally on the server to cull old, not-used jobs. I will do a pass in the coming weeks.
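For reference, a rough sketch of what such a cull could look like. This is not the script mentioned above: the function names, the 30-day approximation of a month, and the injected `delete` callback are all assumptions for illustration. It works on a plain mapping of job name to last-run time rather than calling Jenkins, and defaults to a dry run.

```python
from datetime import datetime, timedelta, timezone


def find_stale_jobs(last_run_by_job, months, now=None):
    """Return names of jobs whose last run is older than roughly `months` months."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=30 * months)  # approximation: 30-day months
    return sorted(
        name for name, last_run in last_run_by_job.items() if last_run < cutoff
    )


def cull_jobs(last_run_by_job, months, delete_jobs=False, delete=None, now=None):
    """List stale jobs; only call `delete` when delete_jobs is True."""
    stale = find_stale_jobs(last_run_by_job, months, now)
    for name in stale:
        if delete_jobs and delete is not None:
            delete(name)  # hypothetical callback that would remove the job
        else:
            print(f"[dry run] would delete {name}")
    return stale


jobs = {
    # Last-run date taken from the Corretto job discussed earlier in this issue.
    "Test_openjdk8_corretto_sanity.system_x86-64_linux": datetime(2022, 4, 14, tzinfo=timezone.utc),
    # Made-up job name and date for contrast.
    "Example_recently_run_job": datetime(2024, 5, 6, tzinfo=timezone.utc),
}
stale = cull_jobs(jobs, months=12, now=datetime(2024, 5, 8, tzinfo=timezone.utc))
```

Keeping deletion behind an explicit flag means the list can be reviewed by someone who knows which jobs are still in scope before anything is removed.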
These aren't new but were discovered while working on #2108
There are a lot of warnings in the Jenkins log which we should look at clearing up in the interests of avoiding the risk of being unable to see the wood for the trees. All are related to `NODE_LABEL`'s use in various places. Presumably the parameters are being passed in from the upstream job somewhere but they are not defined on the callee so are superfluous. Can we easily stop 'rogue' parameters being passed in or should we set the `-D` mentioned in the error?
90 WARNING hudson.model.ParametersAction#filter: Skipped parameter NODE_LABEL as it is undefined on build-scripts/release/create_installer_mac. Set -Dhudson.model.ParametersAction.keepUndefinedParameters=true to allow undefined parameters to be injected as environment variables or -Dhudson.model.ParametersAction.safeParameters=[comma-separated list] to whitelist specific parameter names, even though it represents a security breach or -Dhudson.model.ParametersAction.keepUndefinedParameters=false to no longer show this message.