Closed dduportal closed 2 years ago
As underlined by @daniel-beck (thanks!), the error is caused by a safety mechanism in the Groovy script executed by this pipeline, which fails when the certificate expiration is coming soon:
[2022-05-21T12:20:29.237Z] Caught: java.io.IOException: Failed to create one or more output files.
[2022-05-21T12:20:29.237Z] java.io.IOException: Failed to create one or more output files.
[2022-05-21T12:20:29.237Z] at dotnetSdk.addFailure(dotnetSdk.groovy:41)
[2022-05-21T12:20:29.237Z] at dotnetSdk.run(dotnetSdk.groovy:61)
[2022-05-21T12:20:29.237Z] at runner$_run_closure1.doCall(runner.groovy:13)
[2022-05-21T12:20:29.237Z] at runner.run(runner.groovy:10)
[2022-05-21T12:20:29.237Z] Suppressed: java.security.cert.CertificateExpiredException: NotAfter: Tue Jun 14 10:42:00 UTC 2022
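To illustrate this kind of safety check, here is a minimal Python sketch that fails early when a certificate's NotAfter date falls inside a warning window. The 30-day threshold and the function names are assumptions made for illustration only; the real check lives in the Groovy script run by the pipeline and may work differently.

```python
import ssl
from datetime import datetime, timezone

# Hypothetical warning window; the threshold used by the real script is not known here.
WARN_DAYS = 30


def days_until_expiry(not_after, now):
    """Return the number of days between `now` and the certificate's notAfter date.

    `not_after` uses the OpenSSL text format, e.g. "Jun 14 10:42:00 2022 GMT".
    `now` must be a timezone-aware datetime.
    """
    expiry_seconds = ssl.cert_time_to_seconds(not_after)
    return (expiry_seconds - now.timestamp()) / 86400


def check_certificate(not_after, now, warn_days=WARN_DAYS):
    """Raise if the certificate expires within `warn_days`, else return the days left."""
    remaining = days_until_expiry(not_after, now)
    if remaining < warn_days:
        raise RuntimeError(
            f"certificate expires in {remaining:.1f} days (NotAfter: {not_after})"
        )
    return remaining


if __name__ == "__main__":
    # Timestamps taken from the log above.
    run_time = datetime(2022, 5, 21, 12, 20, 29, tzinfo=timezone.utc)
    print(days_until_expiry("Jun 14 10:42:00 2022 GMT", run_time))
```

With the timestamps from the log above, a little under 24 days remained when the job ran, so any warning window of 24 days or more would trip such a check.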
Hello @daniel-beck @olblak @timja, could you help us understand this whole "certificate almost expired, to be rotated" situation? It seems to be a certificate somewhere in the update center code, but it's not clear which/what/how.
Searched the following issues from the past, but I honestly understand nothing about the components involved:
I don't see anything related to the certificate in https://github.com/jenkins-infra/crawler.
What did I miss?
It’ll be the update center signing certificate
This certificate also impacts update-center within trusted-ci. This is now a TOP priority for the infra team.
A bit more context to help us understand which certificate is which:
-2
new name
So, as per the readme at https://github.com/jenkins-infra/update-center2/tree/master/resources/certificates#jenkins-update-center-root-ca-2, only @oleg-nenashev, @olblak and @kohsuke have the key of the new (2021) CA: without their help we won't be able to generate a new UC certificate.
Thanks to the help of @olblak, who generated a new update center certificate and uploaded the credentials to trusted, both builds are working again as expected.
ToDo list before closing:
Filling the "plugin releases gap"
Following the (private link) procedure at https://github.com/jenkins-infra/runbooks/tree/main/updates, we got the list of plugins released in the past 72 hours (the update-center job has been failing for ~24 hours):
{"releases":[{"name":"azure-vm-agents","version":"815.vf2f07da070ee"},{"name":"cas-plugin","version":"1.6.2"},{"name":"checkmarx-ast-scanner","version":"2.0.11-274.va_d38ce3e7a_35"},{"name":"codescene","version":"1.5.7"},{"name":"dark-theme","version":"185.v276b_5a_8966a_e"},{"name":"ecutest","version":"2.34"},{"name":"eggplant-runner","version":"0.0.1.108.v32f1564b_19d0"},{"name":"elastic-axis","version":"1.6.0"},{"name":"influxdb","version":"3.2.1"},{"name":"jenkins-multijob-plugin","version":"611.v9d3180d752e6"},{"name":"jenkinsci-appspider-plugin","version":"1.0.15"},{"name":"jobConfigHistory","version":"1146.v94c2521f9213"},{"name":"junit","version":"1119.va_a_5e9068da_d7"},{"name":"junit-attachments","version":"101.v82f494a_00e9e"},{"name":"kubernetes","version":"3600.v144b_cd192ca_a_"},{"name":"opentelemetry","version":"2.7.1-rc2"},{"name":"report-info","version":"1.2"},{"name":"rest-list-parameter","version":"1.6.0"},{"name":"robot","version":"3.2.0"},{"name":"role-strategy","version":"488.v0634ce149b_8c"},{"name":"saml","version":"2.298.vc7a_2b_3958628"},{"name":"schedule-build","version":"301.vfdc555a_b_cf81"},{"name":"theme-manager","version":"1.4"},{"name":"theme-manager","version":"1.3"}]}
Still following the runbook procedure, the "sync script" on the update center VM was executed to make sure that all of these plugin releases are synchronized to the reference mirror, and are available through the get.jenkins.io URL.
Sanity checking: https://get.jenkins.io/plugins/azure-vm-agents/815.vf2f07da070ee/azure-vm-agents.hpi was HTTP/404 right before this operation, and is now available.
Reason: the update-center job has been succeeding since today 07:24am UTC, but it only covers the past few hours, hence the gap.
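As a rough aid for this kind of sanity check, the sketch below turns the releases JSON into download URLs. The URL pattern is inferred from the single azure-vm-agents link quoted above, so treating it as valid for every plugin is an assumption, and `release_urls` is a hypothetical helper, not part of the runbook.

```python
import json

# Base URL taken from the sanity-check link above.
BASE_URL = "https://get.jenkins.io/plugins"


def release_urls(releases_json, base_url=BASE_URL):
    """Build one download URL per entry of a {"releases": [...]} JSON document.

    Assumes every plugin follows the <name>/<version>/<name>.hpi layout
    seen in the azure-vm-agents example.
    """
    releases = json.loads(releases_json)["releases"]
    return [
        f"{base_url}/{release['name']}/{release['version']}/{release['name']}.hpi"
        for release in releases
    ]


if __name__ == "__main__":
    sample = '{"releases":[{"name":"azure-vm-agents","version":"815.vf2f07da070ee"}]}'
    for url in release_urls(sample):
        print(url)
```

Each resulting URL can then be requested before and after the sync to confirm that the release moved from HTTP/404 to available.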
Email + IRC notification done
Old secrets.zip credential removed in trusted.ci (I got a local encrypted backup)
update-center readme (https://github.com/jenkins-infra/update-center/pull/5) additionally to the new runbook.
update-center private repo (as only admin members are allowed).
Closing: the event is in the jenkins-infra-team calendar
Some files are still signed with the old (expired) certificate:
signature check failed for http://updates.jenkins.io/updates/hudson.plugins.groovy.GroovyInstaller.json
ERROR: Signature verification failed in downloadable 'hudson.plugins.groovy.GroovyInstaller' <a href='#' class='showDetails'>(show details)</a><pre style='display:none'>java.security.cert.CertificateExpiredException: NotAfter: Tue Jun 14 13:42:00 EEST 2022<br>
also
signature check failed for https://updates.jenkins.io/updates/org.jenkinsci.plugins.scriptler.CentralScriptJsonCatalog.json
ERROR: Signature verification failed in downloadable 'org.jenkinsci.plugins.scriptler.CentralScriptJsonCatalog' <a href='#' class='showDetails'>(show details)</a><pre style='display:none'>java.security.cert.CertificateExpiredException: NotAfter: Tue Jun 14 13:42:00 EEST 2022<br>
The problem there is that these haven't been updated in a while (3 and 11 months respectively). Nothing to do with the cert, just a regular crawler failure.
Some other stuff was last updated 2017 🤷
Would it be a problem if we find a way to regenerate these?
shouldn't be, often crawler depends on external websites which change their markup so it can get broken easily...
🤔 can we, instead, re-sign them one time? (still a high level question, haven't checked how to do it technically yet)
It's been done before, that's what KK did last time a bunch of them expired and he didn't have time to fix the scripts. I think he either replayed the jobs or hacked the scripts to re-sign the existing ones in some way.
Thanks folks. Keeping this issue open, adding to the current milestone so we'll track it.
FYI every Jenkins instance will be complaining about this when it tries to check for updates
(It seems plugin updates still work at least)
@dduportal and I think we found the issue: executing the trusted.ci job on only groovy.groovy (as it's one of the failing signatures), we noticed an error when it tries to fetch the HTML to retrieve the data:
12:00:44 + export JENKINS_SIGNER=-key /update-center.key -certificate /update-center.cert -root-certificate ****/jenkins-update-center-root-ca.crt
12:00:44 + groovy -Dgrape.config=./grapeConfig.xml ./lib/runner.groovy groovy.groovy
12:02:22 loading dependencies...done
12:02:22 Caught: com.gargoylesoftware.htmlunit.ScriptException: ReferenceError: "fetch" is not defined. (https://groovy.jfrog.io/ui/externals/systemjs/dist/s.min.js#1)
12:02:22 ======= EXCEPTION START ========
12:02:22 EcmaError: lineNumber=[1] column=[0] lineSource=[
] name=[ReferenceError] sourceName=[https://groovy.jfrog.io/ui/externals/systemjs/dist/s.min.js] message=[ReferenceError: "fetch" is not defined. (https://groovy.jfrog.io/ui/externals/systemjs/dist/s.min.js#1)]
12:02:22 com.gargoylesoftware.htmlunit.ScriptException: ReferenceError: "fetch" is not defined. (https://groovy.jfrog.io/ui/externals/systemjs/dist/s.min.js#1)
12:02:22 at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:883)
So, for each of these errors, the script can neither generate the JSON nor sign it, and thus there isn't any new version of these files to rsync.
every Jenkins instance
… with affected plugins installed?
We split the work, the faster would win :)
… with affected plugins installed?
Not sure, weekly.ci.jenkins.io is affected and that doesn't have many plugins. (groovy plugin I think is there for some reason)
By looking at the file dates in /var/www/updates.jenkins.io/updates, we can see the affected plugins:
-rw-rw-r-- 1 www-data www-data 8940 Jun 2 2017 hudson.plugins.flyway.FlywayInstaller.json.html
-rw-rw-r-- 1 www-data www-data 25462 Jun 25 2018 org.jenkinsci.plugins.perlinstaller.PerlInstaller.json.html
-rw-rw-r-- 1 www-data www-data 23641 Jul 5 2021 hudson.plugins.groovy.GroovyInstaller.json.html
-rw-rw-r-- 1 www-data www-data 8752 Jul 27 2021 io.jenkins.plugins.codeql.CodeQLInstaller.json.html
-rw-rw-r-- 1 www-data www-data 34669 Mar 12 12:24 org.jenkinsci.plugins.scriptler.CentralScriptJsonCatalog.json.html
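The same inspection can be scripted. Below is a minimal Python sketch that lists files whose modification time is older than a given age; the function name and the 30-day cutoff in the usage example are assumptions chosen for illustration, not values from the actual tooling.

```python
import time
from pathlib import Path


def stale_files(directory, max_age_days, now=None):
    """Return (file name, age in days) for files not modified within `max_age_days`.

    Results are sorted oldest first. `now` is a Unix timestamp and defaults
    to the current time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime
        if mtime < cutoff:
            stale.append((path.name, (now - mtime) / 86400))
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Directory taken from the comment above; the 30-day cutoff is arbitrary.
    for name, age in stale_files("/var/www/updates.jenkins.io/updates", 30):
        print(f"{age:8.0f} days  {name}")
```

Files that a healthy crawler run regenerates regularly should not appear in the output, so anything listed is a candidate for a broken crawler script.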
@dduportal has prepared a signer.groovy script to sign them.
We pushed the newly signed files for them, which resolved the error message in Jenkins instances.
Long term solution: fix all failing groovy scripts in https://github.com/jenkins-infra/crawler
We used the script from https://github.com/jenkins-infra/crawler/pull/118 to manually re-generate all the tools metadata.
In the future, this script might be called on trusted.ci with a "replay" job
Fix for the "groovy" tools installer: https://github.com/jenkins-infra/crawler/pull/117 . That should update the current list (stuck at 3.0.8) to 3.0.11.
Closing the incident: all metadata files are now signed with the latest certificate, as per our testing (and groovy was fixed).
Please feel free to reopen with details if you have any other error.
Service(s)
trusted.ci.jenkins.io updates.jenkins.io
Summary
The certificate of the update center expires soon (14th of June 2022) and must be renewed for 1 year. Until it is renewed, updates of the update-center JSON and tools metadata ("crawler") are blocked.
[initial message] The pipeline job crawler on trusted.ci.jenkins.io has been failing on its principal branch for 5 days, which means that the tools metadata have not been published for these 5 days:
cp: cannot stat 'target/*.html': No such file or directory
during the sh 'cp <...> step - https://github.com/jenkins-infra/crawler/blob/master/Jenkinsfile#L65
It seems that there are no HTML files generated in the target/ directory while there should be.
Reproduction steps
No response