jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose

Artifact downloads failed on agent using repo cache #3203

Closed MarkEWaite closed 1 year ago

MarkEWaite commented 1 year ago

Service(s)

ci.jenkins.io

Summary

https://ci.jenkins.io/job/Plugins/job/embeddable-build-status-plugin/job/PR-123/1/console shows that the artifact download failed from the caching repository.

Reproduction steps

  1. Configure the repository to use the caching repository
  2. Run a build with the caching repository
  3. Confirm the build fails because it is unable to download the artifacts as in this console

lemeurherve commented 1 year ago

We've had some issues restoring the service after the eks-public cluster upgrade to Kubernetes 1.23. I'm removing the 'aws' provider for now, which should help while we fix it.

dduportal commented 1 year ago

We made a lot of fixes and improvements to this service, and it is now back online.

dduportal commented 1 year ago

Monitoring has been enabled in https://github.com/jenkins-infra/datadog/pull/113 (thanks a lot @lemeurherve @smerle33 !!).

Closing the issue as the next steps are tracked in https://github.com/jenkins-infra/helpdesk/issues/2752

MarkEWaite commented 1 year ago

I see a failure today in a git client plugin pull request build that reports an issue downloading from repo.jenkins-ci.org. I believe it is unrelated to this issue and is actually a failure of repo.jenkins-ci.org itself.

The message is:

07:40:24  [ERROR] Failed to execute goal on project git-client: 
Could not resolve dependencies for project org.jenkins-ci.plugins:git-client:hpi:3.12.2-rc3266.3615b_2e9568f: 
Failed to collect dependencies at org.jenkins-ci.main:jenkins-war:war:2.346.3 -> org.jenkins-ci.modules:launchd-slave-installer:jar:1.2: 
Failed to read artifact descriptor for org.jenkins-ci.modules:launchd-slave-installer:jar:1.2: 
Could not transfer artifact org.jenkins-ci:jenkins:pom:1.26 from/to repo.jenkins-ci.org (https://repo.jenkins-ci.org/public/): 
transfer failed for https://repo.jenkins-ci.org/public/org/jenkins-ci/jenkins/1.26/jenkins-1.26.pom: 
Connect to repo.jenkins-ci.org:443 [repo.jenkins-ci.org/35.243.130.159] failed: Connection timed out: connect -> [Help 1]

I've restarted that job to see if the problem is transient.

lemeurherve commented 1 year ago

Unrelated indeed: the artifact caching proxy isn't enabled on this repo.

Thanks for the report anyway. We'll still get occasional failures from repo.jenkins-ci.org; hopefully the proxy will alleviate some of them.

lemeurherve commented 1 year ago

With the latest fallback and settings improvements I've added to the pipeline library, these builds shouldn't fail because of the artifact caching proxies anymore.
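
To illustrate the general idea of a fallback, here is a minimal Python sketch. It is not the pipeline library's implementation, which is not shown in this thread. The repository order, the `TransferError` type, and the `fetch` callable are all hypothetical; only the two URLs come from the logs in this issue.

```python
from typing import Callable, List, Sequence

# Assumption: try a caching proxy first, then the upstream repository.
# Both URLs appear in this thread; the ordering is illustrative only.
REPOSITORIES: Sequence[str] = (
    "https://repo.azure.jenkins.io/public/",
    "https://repo.jenkins-ci.org/public/",
)


class TransferError(Exception):
    """Hypothetical error raised by a fetcher on a timeout or 5xx response."""


def fetch_with_fallback(
    path: str,
    fetch: Callable[[str], bytes],
    repositories: Sequence[str] = REPOSITORIES,
) -> bytes:
    """Try each repository in order and return the first successful download."""
    errors: List[str] = []
    for base in repositories:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except TransferError as exc:
            # Record the failure and move on to the next repository.
            errors.append(f"{url}: {exc}")
    raise TransferError("all repositories failed: " + "; ".join(errors))
```

The `fetch` callable is injected so that the sketch stays independent of any particular HTTP client; a real caller would wrap its own download function and raise `TransferError` on a timeout or gateway error.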

MarkEWaite commented 1 year ago

The schedule build plugin failed on a pull request build today in the Azure environment.

See https://ci.jenkins.io/job/Plugins/job/schedule-build-plugin/job/PR-177/1/console where it reports:

16:05:45  [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M7:test (default-test) on project schedule-build: Could not transfer artifact org.opentest4j:opentest4j:jar:1.1.1 from/to azure-proxy (https://repo.azure.jenkins.io/public/): transfer failed for https://repo.azure.jenkins.io/public/org/opentest4j/opentest4j/1.1.1/opentest4j-1.1.1.jar, status: 504 Gateway Time-out
16:05:45  [ERROR]   org.opentest4j:opentest4j:jar:1.1.1
16:05:45  [ERROR] 
16:05:45  [ERROR] from the specified remote repositories:
16:05:45  [ERROR]   azure-proxy (https://repo.azure.jenkins.io/public/, releases=true, snapshots=true),
16:05:45  [ERROR]   azure-proxy-incrementals (https://repo.azure.jenkins.io/incrementals/, releases=true, snapshots=false)
16:05:45  [ERROR] Path to dependency: 
16:05:45  [ERROR]   1) org.apache.maven.surefire:surefire-junit-platform:jar:3.0.0-M7
16:05:45  [ERROR]   2) org.junit.platform:junit-platform-launcher:jar:1.3.2
16:05:45  [ERROR]   3) org.junit.platform:junit-platform-engine:jar:1.3.2
16:05:45  [ERROR]   4) org.opentest4j:opentest4j:jar:1.1.1
16:05:45  [ERROR] -> [Help 1]
16:05:45  org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M7:test (default-test) on project schedule-build: Could not transfer artifact org.opentest4j:opentest4j:jar:1.1.1 from/to azure-proxy (https://repo.azure.jenkins.io/public/): transfer failed for https://repo.azure.jenkins.io/public/org/opentest4j/opentest4j/1.1.1/opentest4j-1.1.1.jar, status: 504 Gateway Time-out
16:05:45    org.opentest4j:opentest4j:jar:1.1.1
16:05:45  
16:05:45  from the specified remote repositories:
16:05:45    azure-proxy (https://repo.azure.jenkins.io/public/, releases=true, snapshots=true),
16:05:45    azure-proxy-incrementals (https://repo.azure.jenkins.io/incrementals/, releases=true, snapshots=false)
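
When triaging logs like the one above, it can help to extract which repository id and HTTP status were involved. The following Python sketch does that for the single-line "Could not transfer artifact ... status: NNN" form seen in this log. The regular expression and the `TransferFailure` type are my own assumptions based on these sample lines, not a format guaranteed by Maven.

```python
import re
from typing import NamedTuple, Optional

# Assumption: pattern derived from the log lines quoted in this issue.
TRANSFER_FAILURE = re.compile(
    r"Could not transfer artifact (?P<artifact>\S+) "
    r"from/to (?P<repo_id>\S+) \((?P<repo_url>[^)]+)\): "
    r"transfer failed for [^\s,]+"
    r"(?:, status: (?P<status>\d{3}))?"
)


class TransferFailure(NamedTuple):
    artifact: str
    repo_id: str
    repo_url: str
    status: Optional[int]  # None when the line carries no HTTP status


def parse_transfer_failure(line: str) -> Optional[TransferFailure]:
    """Return the failure details from one log line, or None if it does not match."""
    match = TRANSFER_FAILURE.search(line)
    if match is None:
        return None
    status = match.group("status")
    return TransferFailure(
        artifact=match.group("artifact"),
        repo_id=match.group("repo_id"),
        repo_url=match.group("repo_url"),
        status=int(status) if status else None,
    )
```

Grouping the parsed results by `repo_id` would show, for example, whether failures are concentrated on the `azure-proxy` provider.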

Similar problem on https://ci.jenkins.io/job/Plugins/job/priority-sorter-plugin/job/PR-205/1/console

16:03:52  [ERROR] Plugin org.codehaus.mojo:flatten-maven-plugin:1.3.0 or one of its dependencies could not be resolved: org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 failed to transfer from https://repo.azure.jenkins.io/public/ during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of azure-proxy has elapsed or updates are forced. Original error: Could not transfer artifact org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): transfer failed for https://repo.azure.jenkins.io/public/org/codehaus/mojo/flatten-maven-plugin/1.3.0/flatten-maven-plugin-1.3.0.jar, status: 504 Gateway Time-out -> [Help 1]
16:03:52  org.apache.maven.plugin.PluginResolutionException: Plugin org.codehaus.mojo:flatten-maven-plugin:1.3.0 or one of its dependencies could not be resolved: org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 failed to transfer from https://repo.azure.jenkins.io/public/ during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of azure-proxy has elapsed or updates are forced. Original error: Could not transfer artifact org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): transfer failed for https://repo.azure.jenkins.io/public/org/codehaus/mojo/flatten-maven-plugin/1.3.0/flatten-maven-plugin-1.3.0.jar, status: 504 Gateway Time-out
16:03:52      at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolve (DefaultPluginDependenciesResolver.java:144)
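
The "failure was cached in the local repository" message refers to the marker files (ending in `.lastUpdated`) that Maven writes next to an artifact it failed to download. Besides forcing updates with `mvn -U`, one workaround is to delete those markers so the next build retries the download. The Python sketch below shows that cleanup; the function names are hypothetical, and the local repository path is passed in rather than assumed (`~/.m2/repository` is the usual default).

```python
from pathlib import Path
from typing import List


def find_cached_failures(local_repo: Path) -> List[Path]:
    """Return the '*.lastUpdated' marker files under a Maven local repository."""
    return sorted(p for p in local_repo.rglob("*.lastUpdated") if p.is_file())


def clear_cached_failures(local_repo: Path) -> int:
    """Delete the marker files and return how many were removed."""
    markers = find_cached_failures(local_repo)
    for marker in markers:
        marker.unlink()
    return len(markers)
```

For example, `clear_cached_failures(Path.home() / ".m2" / "repository")` would clear the markers in a default local repository. Only the markers are removed; artifacts that were downloaded successfully are left in place.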

Similar problem with https://ci.jenkins.io/job/Plugins/job/implied-labels-plugin/job/PR-74/1/console

16:03:55  [ERROR] Plugin org.codehaus.mojo:flatten-maven-plugin:1.3.0 or one of its dependencies could not be resolved: org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 failed to transfer from https://repo.azure.jenkins.io/public/ during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of azure-proxy has elapsed or updates are forced. Original error: Could not transfer artifact org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): transfer failed for https://repo.azure.jenkins.io/public/org/codehaus/mojo/flatten-maven-plugin/1.3.0/flatten-maven-plugin-1.3.0.jar, status: 504 Gateway Time-out -> [Help 1]
16:03:55  org.apache.maven.plugin.PluginResolutionException: Plugin org.codehaus.mojo:flatten-maven-plugin:1.3.0 or one of its dependencies could not be resolved: org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 failed to transfer from https://repo.azure.jenkins.io/public/ during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of azure-proxy has elapsed or updates are forced. Original error: Could not transfer artifact org.codehaus.mojo:flatten-maven-plugin:jar:1.3.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): transfer failed for https://repo.azure.jenkins.io/public/org/codehaus/mojo/flatten-maven-plugin/1.3.0/flatten-maven-plugin-1.3.0.jar, status: 504 Gateway Time-out
16:03:55      at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolve (DefaultPluginDependenciesResolver.java:144)

lemeurherve commented 1 year ago

@MarkEWaite As it seems to be the same case as described here, WDYT of (re)closing this issue and keeping only the other one open while I'm working on #2844?

MarkEWaite commented 1 year ago

> @MarkEWaite As it seems to be the same case as described here, WDYT of (re)closing this issue and keeping only the other one open while I'm working on #2844?

Sounds good to me. I didn't realize it was the same case. Thanks for checking!

lemeurherve commented 1 year ago

Thanks for the reports! I've renamed #3221 to make it explicit that it's the Azure artifact caching proxy provider that is sometimes failing for now.