adoptium / infrastructure

This repo contains all information about machine maintenance.
Apache License 2.0

"Skipped parameter" warnings in jenkins server log #2774

Open sxa opened 2 years ago

sxa commented 2 years ago

These aren't new but were discovered while working on #2108

There are a lot of warnings in the Jenkins log which we should look at clearing up, to avoid the risk of being unable to see the wood for the trees. All are related to NODE_LABEL's use in various places.

Presumably the parameters are being passed in from the upstream job somewhere but they are not defined on the callee, so are superfluous. Can we easily stop 'rogue' parameters being passed in, or should we set the -D mentioned in the error?

90 WARNING hudson.model.ParametersAction#filter: Skipped parameter NODE_LABEL as it is undefined on build-scripts/release/create_installer_mac. Set -Dhudson.model.ParametersAction.keepUndefinedParameters=true to allow undefined parameters to be injected as environment variables or -Dhudson.model.ParametersAction.safeParameters=[comma-separated list] to whitelist specific parameter names, even though it represents a security breach or -Dhudson.model.ParametersAction.keepUndefinedParameters=false to no longer show this message.
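
To see which jobs and parameters account for most of these warnings, the log lines can be tallied. The following is a minimal Python sketch, assuming the message text has the shape shown above; the function name `tally_skipped` and the regular expression are illustrative and not part of any Jenkins tooling.

```python
import re
from collections import Counter

# Matches the message body of a "Skipped parameter" warning.
# Backticks around the names and the trailing "(#N)" build number are optional.
SKIPPED = re.compile(
    r"Skipped parameter `?(?P<param>\w+)`? as it is undefined on "
    r"`?(?P<job>[^\s`]+)`?(?: \(#(?P<build>[\d,]+)\))?"
)


def tally_skipped(lines):
    """Count (parameter, job) pairs across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SKIPPED.search(line)
        if match:
            # When no build number follows, the sentence-ending "." is
            # captured with the job name, so strip it here.
            job = match.group("job").rstrip(".")
            counts[(match.group("param"), job)] += 1
    return counts
```

Calling `tally_skipped(open(path))` on a saved copy of the log (where `path` is wherever that copy lives) and then `counts.most_common()` would list the noisiest parameter and job combinations first.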

sxa commented 9 months ago

@sophia-guo Is there a way to trigger regeneration of some of the jobs that aren't run regularly e.g. the external ones, and the non-hotspot ones like j9? Those seem to be most of the ones in the test area that are remaining.

Did you mean that jobs which are not running (not triggered to run, or disabled) are still generating warning messages?

It seems that way, yes.

sxa commented 8 months ago

@sophia-guo Is there a way to trigger regeneration of some of the jobs that aren't run regularly e.g. the external ones, and the non-hotspot ones like j9? Those seem to be most of the ones in the test area that are remaining.

Can someone confirm the above for me please? Is there an easy way to trigger this that doesn't involve running a full openjdkXX-pipeline with the aqaAutoGen flag? Can I just run https://ci.adoptium.net/job/Test_Job_Auto_Gen/build?delay=0sec with the appropriate flags for the other variants to clear them up? This seems like an easy fix if it's all that's required - are there any potential side-effects from doing that?

sxa commented 8 months ago

I've kicked off these two runs:

This will cover most of the ones we have remaining from a test perspective today, although a few others are notable in the logs too ... @smlambert @sophia-guo are these still needed and if not can you delete the jobs from the jenkins server please?

sophia-guo commented 8 months ago

I don't have permission to delete Jenkins jobs.

smlambert commented 8 months ago

We also have the problem of TRSS getting confused when we delete Jenkins jobs, so we should programmatically regenerate them without deletion or else we end up with this type of problem: https://github.com/adoptium/aqa-test-tools/issues/860

sxa commented 8 months ago

I don't have permission to delete Jenkins jobs.

Which ones would you like deleting? I can action those if required (Shelley can too!)

We also have the problem of TRSS getting confused when we delete Jenkins jobs, so we should programmatically regenerate them without deletion or else we end up with this type of problem: https://github.com/adoptium/aqa-test-tools/issues/860

I believe the problem in there was caused by deletion of jobs that were still in use which then restarted from a build number of 1. In these cases I'm asking if the jobs are still required, or if they can be removed without replacement, which I do not believe is related to the problem in 860, but if I'm incorrect please let me know. If there is a specific process for the regeneration other than what I've done today in the above comment please let me know :-)

The question here is whether these jobs are required at all i.e. whether they can be removed from jenkins and TRSS (I'm unclear if any of these ones are monitored by TRSS).

sxa commented 8 months ago

Getting on top of these now I think - there were over 5000 yesterday, but as of halfway through today we've only had 8. I'll do another summary over the weekend. We could do with regenerating the -hotspot build jobs I think (especially since we seem to have switched all of them off for now, which I would suggest isn't ideal unless they're running outside the triggers).

@andrew-m-leonard is there a way to point the regen jobs at a configuration that includes the non-temurin ones?

There have also been instances of these on the pr-tester jobs which could probably do with a bit of maintenance (although they're not showing up in the log today so far)

andrew-m-leonard commented 8 months ago

So I think the other variants need to be in the pipeline's targetConfiguration to be regenerated, so if you could point the regen at a fork with such a change, that could possibly work?

sxa commented 8 months ago

Yeah that's what I wanted to do but it wasn't clear how to make that work. Is there a way to point it at an alternate location or would I have to create a new branch, put the changes in, edit the generator job (always a risk!) to use the fork, and then run it ... Or is there a simpler way?

There is DEFAULTS_URL which can point externally (although the default in the description of that points to something at the old AdoptOpenJDK repo, and I don't think the equivalent file in the new repo is the one I want)

sxa commented 8 months ago

From today and yesterday (Note this may not be an exhaustive list since it's not necessarily the same each day but gives an idea. We're getting close now though!):

Build jobs (11)

```
1 Skipped parameter `DYNAMIC_COMPILE` as it is undefined on `build-scripts/jobs/jdk8u/jdk8u-linux-x64-hotspot_SmokeTests` (#146).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk11u/jdk11u-alpine-linux-aarch64-temurin` (#43).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk17u/jdk17u-linux-s390x-hotspot` (#50).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk8u/jdk8u-alpine-linux-aarch64-temurin` (#140).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk8u/jdk8u-linux-arm-hotspot` (#830).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk8u/jdk8u-linux-s390x-temurin` (#272).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk8u/jdk8u-solaris-sparcv9-hotspot` (#644).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk8u/jdk8u-windows-x86-32-hotspot` (#1,112).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk/jdk-mac-aarch64-hotspot` (#36).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk/jdk-mac-aarch64-hotspot` (#37).
1 Skipped parameter `NODE_LABEL` as it is undefined on `build-scripts/jobs/jdk/jdk-mac-aarch64-hotspot` (#38).
```

Test jobs (28) J9(16)/Dragonwell(9)/Bisheng(3)

```
1 Skipped parameter `ACTIVE_NODE_TIMEOUT` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `ACTIVE_NODE_TIMEOUT` as it is undefined on `Test_openjdk11_j9_extended.openjdk_ppc64le_linux_xl` (#7).
1 Skipped parameter `ACTIVE_NODE_TIMEOUT` as it is undefined on `Test_openjdk11_j9_extended.openjdk_s390x_linux_xl` (#9).
1 Skipped parameter `DYNAMIC_COMPILE` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `DYNAMIC_COMPILE` as it is undefined on `Test_openjdk11_j9_sanity.openjdk_s390x_linux` (#674).
1 Skipped parameter `DYNAMIC_COMPILE` as it is undefined on `Test_openjdk11_j9_sanity.openjdk_x86-64_windows` (#708).
1 Skipped parameter `GENERATE_JOBS` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `JDK_BRANCH` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `JDK_BRANCH` as it is undefined on `Test_openjdk11_j9_extended.openjdk_ppc64le_linux_xl` (#7).
1 Skipped parameter `JDK_BRANCH` as it is undefined on `Test_openjdk11_j9_extended.openjdk_s390x_linux_xl` (#9).
1 Skipped parameter `JDK_REPO` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `JDK_REPO` as it is undefined on `Test_openjdk11_j9_extended.openjdk_ppc64le_linux_xl` (#7).
1 Skipped parameter `JDK_REPO` as it is undefined on `Test_openjdk11_j9_extended.openjdk_s390x_linux_xl` (#9).
1 Skipped parameter `LABEL_ADDITION` as it is undefined on `Test_openjdk11_j9_extended.openjdk_ppc64le_linux_xl` (#7).
1 Skipped parameter `LABEL_ADDITION` as it is undefined on `Test_openjdk11_j9_extended.openjdk_s390x_linux_xl` (#9).
1 Skipped parameter `NON_AQA_TEST_REPOS` as it is undefined on `Test_openjdk11_bisheng_extended.openjdk_x86-64_linux_rerun` (#1).
1 Skipped parameter `NUM_MACHINES` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `PARALLEL` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `PLATFORM_AND_MACHINE` as it is undefined on `Test_openjdk11_bisheng_extended.openjdk_x86-64_linux_rerun` (#1).
1 Skipped parameter `RELEASE_TAG` as it is undefined on `Test_openjdk11_j9_extended.external_ppc64le_linux_quarkus_openshift` (#25).
1 Skipped parameter `RERUN_ITERATIONS` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `RUNTIME_NAME` as it is undefined on `Test_openjdk11_j9_extended.external_ppc64le_linux_quarkus_openshift` (#25).
1 Skipped parameter `TKG_BRANCH` as it is undefined on `Test_openjdk11_j9_extended.external_ppc64le_linux_quarkus_openshift` (#25).
1 Skipped parameter `TKG_REPO` as it is undefined on `Test_openjdk11_j9_extended.external_ppc64le_linux_quarkus_openshift` (#25).
1 Skipped parameter `TKG_SHA` as it is undefined on `Test_openjdk11_bisheng_extended.openjdk_x86-64_linux_rerun` (#1).
1 Skipped parameter `USE_TESTENV_PROPERTIES` as it is undefined on `Test_openjdk11_dragonwell_sanity.openjdk_x86-64_linux` (#459).
1 Skipped parameter `USE_TESTENV_PROPERTIES` as it is undefined on `Test_openjdk11_j9_sanity.openjdk_s390x_linux` (#674).
1 Skipped parameter `USE_TESTENV_PROPERTIES` as it is undefined on `Test_openjdk11_j9_sanity.openjdk_x86-64_windows` (#708).
```

Test jobs (21) - HotSpot/Grinder

```
1 Skipped parameter `ADOPTOPENJDK_SYSTEMTEST_BRANCH` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `ADOPTOPENJDK_SYSTEMTEST_REPO` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `ARCHIVE_TEST_RESULTS` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `DEBUG_IMAGES_REQUIRED` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `GITHUB_TOKEN` as it is undefined on `refactor_openjdk_release_tool_new` (#25).
1 Skipped parameter `NON_AQA_TEST_REPOS` as it is undefined on `Grinder_security` (#1).
1 Skipped parameter `OPENJ9_SYSTEMTEST_BRANCH` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `OPENJ9_SYSTEMTEST_REPO` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `PLATFORM_AND_MACHINE` as it is undefined on `Grinder_security` (#1).
1 Skipped parameter `RELEASE_TAG` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `RELEASE_TAG` as it is undefined on `Test_openjdk11_hs_sanity.external_x86-64_linux_openliberty-mp-tck` (#497).
1 Skipped parameter `RUNTIME_NAME` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `RUNTIME_NAME` as it is undefined on `Test_openjdk11_hs_sanity.external_x86-64_linux_openliberty-mp-tck` (#497).
1 Skipped parameter `STF_BRANCH` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `STF_REPO` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `TEST_IMAGES_REQUIRED` as it is undefined on `grinder_sandbox_iteration_2` (#18).
1 Skipped parameter `TKG_BRANCH` as it is undefined on `Test_openjdk11_hs_sanity.external_x86-64_linux_openliberty-mp-tck` (#497).
1 Skipped parameter `TKG_REPO` as it is undefined on `Test_openjdk11_hs_sanity.external_x86-64_linux_openliberty-mp-tck` (#497).
1 Skipped parameter `TKG_SHA` as it is undefined on `Grinder_scala` (#16).
1 Skipped parameter `TKG_SHA` as it is undefined on `Grinder_security` (#1).
1 Skipped parameter `USE_TESTENV_PROPERTIES` as it is undefined on `Grinder_scala` (#16).
```

sxa commented 7 months ago

The test autogen jobs seem to have two parameters causing problems too:

sxa commented 7 months ago

@smlambert @andrew-m-leonard Is there a way we can regen the jobs from the earlier comment (plus the pr-tester ones, which can show up problems)? I'm not clear on how to do this, so if you're able to trigger it and put the information in here that would be appreciated, so we can clear off the relatively small number of remaining ones that we seem to have.

andrew-m-leonard commented 7 months ago

@sxa these look like job instances, e.g. `build-scripts/jobs/jdk/jdk-mac-aarch64-hotspot` (#38)? Can't we just delete those particular job numbers?

sxa commented 7 months ago

@sxa these look like job instances, e.g. `build-scripts/jobs/jdk/jdk-mac-aarch64-hotspot` (#38)? Can't we just delete those particular job numbers?

A fair comment - I hadn't actually spotted that it was on non-current ones. If you confirm you're ok with us just deleting them I'm happy to go through and do that as a background task.

That would likely just leave things like this on the PR tester jobs: 2024-05-02 08:36:36.633+0000 [id=4239618] WARNING hudson.model.ParametersAction#filter: Skipped parameter NODE_LABEL as it is undefined on build-scripts-pr-tester/build-test/jobs/jdk/jdk-alpine-linux-x64-temurin (#9). Set -Dhudson.model.ParametersAction.keepUndefinedParameters=true to allow undefined parameters to be injected as environment variables or -Dhudson.model.ParametersAction.safeParameters=[comma-separated list] to whitelist specific parameter names, even though it represents a security breach or -Dhudson.model.ParametersAction.keepUndefinedParameters=false to no longer show this message.

sxa commented 7 months ago

Test jobs (28) J9(16)/Dragonwell(9)/Bisheng(3)

From a comment from Shelley in a meeting earlier today, it seems likely that running https://ci.eclipse.org/temurin-compliance/job/AQA_Test_Pipeline with a VARIANT set appropriately and AUTO_AQA_GEN checked will regenerate these.

(Although I'm not sure if this will regen the _quarkus_openshift, _xl and _rerun jobs so it would be good to have a way of doing that)

Similarly I'm unclear if we have a way to regen the various Grinder variants (although almost all of those in the twisty earlier were grinder_sandbox_iteration_2).

sxa commented 7 months ago

Runs - noting that the AQA_Test_Pipeline job does not have an option for SDK_RESOURCE=upstream so I'm having to locate a suitable customized URL for these (although worst case they just fail, but the regen will still have occurred by that point!)

This will cover most of the outstanding test jobs other than Grinders, _rerun and the specialist external ones.

sxa commented 7 months ago

Noting also that we're getting a couple of the parameters giving warnings on Test_Job_Auto_Gen as called from AQA_Test_Pipeline which will likely require some sort of manual remediation:

(I've had 3100 of those lines from Test_Job_Auto_Gen in the last two hours since I kicked off those jobs in the previous comment)

smlambert commented 7 months ago

FYI, TEST_JOB_NAME only applies to AQA_Test_Pipeline; there is no parameter of that name used in the generated jobs (nor do we want there to be).

smlambert commented 7 months ago

Similarly I'm unclear if we have a way to regen the various Grinder variants (although almost all of those in the twisty earlier were grinder_sandbox_iteration_2).

I have been slowly removing grinder variants especially if they were last run 1 yr or more ago

sxa commented 7 months ago

FYI, TEST_JOB_NAME only applies to AQA_Test_Pipeline; there is no parameter of that name used in the generated jobs (nor do we want there to be).

Makes sense, although the warning would suggest that AQA_Test_Pipeline is trying to pass that parameter down to Test_Job_Auto_Gen somewhere.

smlambert commented 7 months ago

Makes sense, although the warning would suggest that AQA_Test_Pipeline is trying to pass that parameter down to Test_Job_Auto_Gen somewhere.

The code cycles through all parameters and passes them down (with some exceptions at https://github.com/adoptium/aqa-tests/blob/master/buildenv/jenkins/aqaTestPipeline.groovy#L110-L114). It can be added to the set of exceptions that are not added to childParams so that when the downstreamJob is created (at https://github.com/adoptium/aqa-tests/blob/master/buildenv/jenkins/aqaTestPipeline.groovy#L147), the content of childParams is without it.
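
The approach described here can be illustrated with a short sketch. It is written in Python purely for illustration and is not the Groovy code in aqaTestPipeline.groovy; the names `EXCLUDED_FROM_CHILD` and `build_child_params` are hypothetical, and the existing exceptions in the real pipeline are not reproduced.

```python
# Hypothetical exclusion set. TEST_JOB_NAME is the parameter discussed in this
# thread; the pipeline's existing exceptions are deliberately not listed here.
EXCLUDED_FROM_CHILD = {"TEST_JOB_NAME"}


def build_child_params(parent_params, excluded=EXCLUDED_FROM_CHILD):
    """Return a copy of parent_params without the excluded names.

    The parent mapping is left untouched, so the upstream job can still use
    the parameter itself while the downstream job never receives it.
    """
    return {
        name: value
        for name, value in parent_params.items()
        if name not in excluded
    }
```

With a filter like this in place, the downstream job is only handed parameters it is expected to define, so the "Skipped parameter" warning for the excluded name should no longer be triggered by that call.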

sxa commented 7 months ago

Deep dive into one of them (Corretto JDK8 sanity/system) for reference. There are only 13 entries relating to test skipped parameters today so far, and 8 of them are from different parameters on this job so yesterday's regens appear to have made quite a big difference.

sxa commented 6 months ago

I have been slowly removing grinder variants especially if they were last run 1 yr or more ago

@smlambert Can "suffixed" jobs like https://ci.adoptium.net/job/Test_openjdk11_hs_sanity.external_s390x_linux_system-test/ be removed too? The ones I've looked at today seem to be ones that haven't been run in nearly 3 years so I guess they were likely generated as some sort of experiment and are no longer required.

I don't want to be deleting them myself though - would rather someone from the test side with knowledge of the particular jobs that should be in scope was able to handle it.

smlambert commented 6 months ago

I have a script that will look for jobs that have not been run in XX number of months, and optionally delete them if 'deleteJobs' parameter equals true. I think we should run such a job occasionally on the server to cull old, not-used jobs. I will do a pass in the coming weeks.
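
The script itself is not shown in this thread. As a rough sketch of the selection logic only, assuming the last-run time of each job is already known, something along these lines would pick out the stale jobs. All names here (`find_stale_jobs`, `cull`, `delete_jobs`) are hypothetical, a month is approximated as 30 days, and no Jenkins API calls are made.

```python
from datetime import datetime, timedelta


def find_stale_jobs(last_run_by_job, months, now):
    """Return sorted names of jobs last run more than `months` months ago."""
    cutoff = now - timedelta(days=30 * months)  # approximation: 30-day months
    return sorted(
        name for name, last_run in last_run_by_job.items() if last_run < cutoff
    )


def cull(last_run_by_job, months, now, delete_jobs=False, delete=None):
    """List stale jobs; call `delete` on each only when delete_jobs is true."""
    stale = find_stale_jobs(last_run_by_job, months, now)
    if delete_jobs and delete is not None:
        for name in stale:
            delete(name)
    return stale
```

Keeping deletion behind an explicit flag means the default run is a dry run that only reports candidates, which gives someone with knowledge of the jobs a chance to review the list first.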