jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose

ci.jenkins.io incorrectly shows build dates in Dec 1969 (the epoch) #3934

Closed MarkEWaite closed 7 months ago

MarkEWaite commented 7 months ago

Service(s)

ci.jenkins.io

Summary

Build dates for the ci.jenkins.io packaging test job show Dec 1969 (the epoch in my time zone).
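
Why the displayed date depends on the viewer: a timestamp of 0 milliseconds is midnight UTC on Jan 1, 1970, so it renders as Dec 31, 1969 in any zone west of UTC. The Python sketch below only illustrates that arithmetic. The function name `render_epoch_ms` and the UTC-7 and UTC+1 offsets are assumptions chosen for the example, not values taken from ci.jenkins.io.

```python
from datetime import datetime, timedelta, timezone


def render_epoch_ms(epoch_ms: int, utc_offset_hours: int) -> str:
    """Format a millisecond timestamp as a date in a fixed-offset zone.

    Hypothetical helper for illustration; it is not part of Jenkins.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_ms / 1000, tz=tz).strftime("%b %d, %Y")


# A timestamp of 0 seen from a zone west of UTC (assumed UTC-7).
print(render_epoch_ms(0, -7))  # Dec 31, 1969
# The same timestamp seen from a zone east of UTC (assumed UTC+1).
print(render_epoch_ms(0, 1))   # Jan 01, 1970
```

Both dates in the reproduction steps are therefore consistent with a single zeroed timestamp viewed from different time zones.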

Reproduction steps

  1. Open the packaging job at https://ci.jenkins.io/job/Packaging/job/packaging/job/master/
  2. Confirm that the build dates include Dec 31, 1969 or Jan 1, 1970, as in this screenshot (screencapture-ci-jenkins-io-job-Packaging-job-packaging-job-master-2024-02-05-13_51_46-edit)

I've seen recent reports of date related errors on Jenkins controllers, but could not find the references to link. I assumed that the date related error reports were associated with incorrect operating system date settings on the controller, but I'm reasonably confident that the issue on ci.jenkins.io is not related to incorrect operating system date.

Reports of a similar issue include:

dduportal commented 7 months ago

Oh that explains a lot! I did not pay enough attention but this is the (indirect) root cause of https://github.com/jenkins-infra/helpdesk/issues/3933.

For info, it happened after a set of plugin upgrades on ci.jenkins.io plus a VM restart last Sunday (the 4th). The timelines of trusted.ci and cert.ci (same kind of controllers, same LTS, a very similar plugin set and VM) did not show the same behavior (fortunately).

Note that ci.jenkins.io did break during the restart due to a Datadog stack overflow error (related to https://github.com/jenkinsci/datadog-plugin/blob/master/CHANGELOG.md#600--2024-01-31)

(Screenshot, 2024-02-05 at 09:13:37)

=> this could explain the corruption but I don't have any real proof.

dduportal commented 7 months ago

As a matter of safety, let's take a snapshot of all our controllers Jenkins home before proceeding with further upgrades (ref. https://github.com/jenkins-infra/kubernetes-management/pull/4941 and https://github.com/jenkins-infra/kubernetes-management/pull/4942)

dduportal commented 7 months ago

> As a matter of safety, let's take a snapshot of all our controllers Jenkins home before proceeding with further upgrades (ref. jenkins-infra/kubernetes-management#4941 and jenkins-infra/kubernetes-management#4942)

Done for all controllers

OrangeDog commented 7 months ago

https://issues.jenkins.io/browse/JENKINS-66328

You also have the same (probably unrelated) additional problem of the labels incorrectly showing daylight saving time.

lemeurherve commented 7 months ago

Another stack overflow today. Corresponding stack trace: https://gist.github.com/lemeurherve/eaf08053a133733a829c24ca0eec7117

Resolved by a docker restart.

Reported on https://github.com/jenkinsci/datadog-plugin/issues/389 too.

nikita-tkachenko-datadog commented 7 months ago

The incorrect dates display is likely related to https://github.com/jenkinsci/datadog-plugin/issues/393 in Jenkins Datadog plugin. It should be resolved in the plugin v6.0.1. The stack overflow is a separate issue that will be fixed in the upcoming plugin release.

lemeurherve commented 7 months ago

Hello @nikita-tkachenko-datadog, I've responded in your issue at https://github.com/jenkinsci/datadog-plugin/issues/393#issuecomment-1932717351: we still have previous builds appearing as failed in 1970 on ci.jenkins.io, even with version 6.0.1.

Thanks for the quick stack overflow fix!

nikita-tkachenko-datadog commented 7 months ago

https://issues.jenkins.io/browse/JENKINS-66328 describes a similar issue. Some of the reports in there are from 25/01/202 (which is before Datadog plugin v6.0.0 was released) and the reporters claim that they're not using the Datadog plugin.

So while there is a plugin data deserialisation problem in Datadog v6.0.0, it is possible that the date/status display issue is caused by something else.

I apologise if my previous comment caused confusion.

dduportal commented 7 months ago

> https://issues.jenkins.io/browse/JENKINS-66328 describes a similar issue. Some of the reports in there are from 25/01/202 (which is before Datadog plugin v6.0.0 was released) and the reporters claim that they're not using the Datadog plugin.
>
> So while there is a plugin data deserialisation problem in Datadog v6.0.0, it is possible that the date/status display issue is caused by something else.
>
> I apologise if my previous comment caused confusion.

No worries, at least we are all in sync. Our goal was to share the knowledge so we can all build a better Jenkins ecosystem!

dduportal commented 7 months ago

> The stack overflow is a separate issue that will be fixed in the upcoming plugin release.

FYI, another stack overflow on ci.jenkins.io while restarting the controller for a Docker-CE upgrade. A restart did the trick with no further corruption (cc @smerle33 for info)

OrangeDog commented 7 months ago

> Some of the reports in there are from 25/01/202 (which is before Datadog plugin v6.0.0 was released) and the reporters claim that they're not using the Datadog plugin.

Indeed. I have never had the Datadog plugin installed, and I have pinned the problem as starting for me at some point between the 5th and the 19th of January. Further, downgrading Jenkins core fixes it.

MarkEWaite commented 7 months ago

> Indeed. I have never had the Datadog plugin installed, and I have pinned the problem as starting for me at some point between the 5th and the 19th of January. Further, downgrading Jenkins core fixes it.

Can you provide the detailed plugin lists in the "before" and "after" state? I suspect it is more likely related to a plugin update than to anything that changed in Jenkins core.

OrangeDog commented 7 months ago

> Can you provide the detailed plugin lists in the "before" and "after" state?

I already provided it on the linked issue.

> I suspect it is more likely related to a plugin update than to anything that changed in Jenkins core.

Then why is it an update to Jenkins core that causes it?

OrangeDog commented 7 months ago

This listing can be used to narrow down the plugins that may have been upgraded before the problem started. As detailed on the linked issue, I already investigated joda-time-api.

Jan 25 10:10 workflow-durable-task-step.jpi
Jan 25 10:10 pipeline-maven.jpi
Jan 25 10:10 pipeline-maven-api.jpi
Jan 25 10:10 matrix-project.jpi
Jan 25 10:10 git-server.jpi
Jan 25 10:10 coverage.jpi
Jan 24 10:21 pipeline-groovy-lib.jpi
Jan 24 10:21 workflow-cps.jpi
Jan 24 10:21 config-file-provider.jpi
Jan 23 10:42 pipeline-model-definition.jpi
Jan 23 10:42 pipeline-stage-tags-metadata.jpi
Jan 23 10:42 pipeline-model-extensions.jpi
Jan 23 10:42 pipeline-model-api.jpi
Jan 23 10:42 pipeline-utility-steps.jpi
Jan 23 10:42 json-path-api.jpi
Jan 22 09:06 font-awesome-api.jpi
Jan 22 09:06 credentials-binding.jpi
Jan 19 12:09 tap.jpi
Jan 19 12:09 workflow-step-api.jpi
Jan 19 12:09 mina-sshd-api-core.jpi
Jan 19 12:09 mina-sshd-api-common.jpi
Jan 18 09:38 structs.jpi
Jan 18 09:38 sshd.jpi
Jan 18 09:38 workflow-multibranch.jpi
Jan 18 09:38 lockable-resources.jpi
Jan 18 09:38 github-branch-source.jpi
Jan 17 09:24 email-ext.jpi
Jan 16 13:04 ssh-slaves.jpi
Jan 16 13:04 junit.jpi
Jan 16 13:04 cloudbees-bitbucket-branch-source.jpi
Jan 15 10:18 analysis-model-api.jpi
Jan 15 10:18 joda-time-api.jpi
Jan 11 13:24 warnings-ng.jpi
Jan  9 09:27 script-security.jpi
Jan  9 09:27 gson-api.jpi
Jan  9 09:27 durable-task.jpi
Jan  9 09:27 branch-api.jpi
Jan  4 14:13 jsch.jpi
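
As a rough sketch of that narrowing step, a listing like this can be filtered to a date window, for example the 5th to the 19th of January mentioned earlier in this thread. The code is Python. The function name `plugins_in_window`, the year 2024 (the listing carries no year), and the shortened sample listing are assumptions made for illustration.

```python
from datetime import datetime


def plugins_in_window(listing: str, start: datetime, end: datetime, year: int = 2024):
    """Return (timestamp, plugin file) pairs dated within [start, end].

    Hypothetical helper; the year is an assumption because the
    listing only shows month, day, and time.
    """
    matches = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or unexpected lines
        month, day, clock, name = parts
        stamp = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M")
        if start <= stamp <= end:
            matches.append((stamp, name))
    return matches


# Shortened sample taken from the listing above.
SAMPLE = """\
Jan 22 09:06 credentials-binding.jpi
Jan 19 12:09 tap.jpi
Jan 15 10:18 joda-time-api.jpi
Jan  9 09:27 script-security.jpi
Jan  4 14:13 jsch.jpi
"""

window = plugins_in_window(SAMPLE, datetime(2024, 1, 5), datetime(2024, 1, 19, 23, 59))
for stamp, name in window:
    print(stamp.strftime("%b %d"), name)
```

With the sample above, only tap.jpi, joda-time-api.jpi, and script-security.jpi fall inside the window. File dates only show when a plugin file was last written, so this narrows the candidates rather than identifying a cause.
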

MarkEWaite commented 7 months ago

> Then why is it an update to Jenkins core that causes it?

I assume that it is not an update to Jenkins core that causes it. As far as I understand it, you updated Jenkins core and Jenkins plugins at the same time. The bad behavior on ci.jenkins.io seems to be due to the datadog plugin update. I assume in your environment there is some other plugin update that is causing unexpected behavior.

dduportal commented 7 months ago

If I may, can we split subjects? This issue tracker is not aimed at solving problems for user-related Jenkins installations, but only for the jenkins.io Jenkins installations.

@OrangeDog you are correct: in your case, something other than the Datadog plugin caused the build history corruption. As underlined by Nikita in https://github.com/jenkinsci/datadog-plugin/issues/393#issuecomment-1933815940, there might be a weird behavior in the core itself during the recovery process which corrupts part of the build records.

But it can be caused by many things, including a plugin upgrade and/or resource issues (hard restart during JENKINS_HOME scanning, memory issue, any stack overflow error, etc.).

But it has to be tracked in the Jira issue with a mention of Nikita's comment (if that is not already the case): there is nothing that the Jenkins infrastructure team can do here to help, except reporting and raising the concern to the community.

MarkEWaite commented 7 months ago

> If I may, can we split subjects? This issue tracker is not aimed at solving problems for user-related Jenkins installations, but only for the jenkins.io Jenkins installations.

That sounds good to me. This issue is tracking the problem as encountered on ci.jenkins.io. JENKINS-66328 is tracking the report from @OrangeDog and others.

OrangeDog commented 7 months ago

@MarkEWaite except it is fixed by downgrading core, not by downgrading any plugins.

Edit: it's not, it just made it much harder to reproduce

dduportal commented 7 months ago

Going to close this issue as we've secured the datadog plugin: it is using the 6.0.2 version everywhere:

The ci.jenkins.io builds are going to be cleaned up over time as we have global build discarding.