jenkins-infra / helpdesk

Open your Infrastructure related issues here for the Jenkins project
https://github.com/jenkins-infra/helpdesk/issues/new/choose
17 stars 10 forks source link

Artifact caching proxy is unreliable #3481

Closed basil closed 1 year ago

basil commented 1 year ago

Service(s)

Artifact-caching-proxy

Summary

Despite https://github.com/jenkins-infra/helpdesk/issues/2752#issuecomment-1490215215, https://ci.jenkins.io/job/Tools/job/bom/job/master/1568/ failed after approximately 90 minutes with

[2023-03-30T21:32:37.394Z] [ERROR] Failed to execute goal on project script-security: Could not resolve dependencies for project org.jenkins-ci.plugins:script-security:hpi:999999-SNAPSHOT: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202) -> [Help 1]
[2023-03-30T21:32:37.394Z] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project script-security: Could not resolve dependencies for project org.jenkins-ci.plugins:script-security:hpi:999999-SNAPSHOT: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:245)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.394Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.394Z] Caused by: org.apache.maven.project.DependencyResolutionException: Could not resolve dependencies for project org.jenkins-ci.plugins:script-security:hpi:999999-SNAPSHOT: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve (DefaultProjectDependenciesResolver.java:187)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:226)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.394Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.394Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.394Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.394Z] Caused by: org.eclipse.aether.resolution.DependencyResolutionException: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.394Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:402)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve (DefaultProjectDependenciesResolver.java:182)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:226)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.394Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.395Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.395Z] Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:506)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts (DefaultArtifactResolver.java:278)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:385)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.project.DefaultProjectDependenciesResolver.resolve (DefaultProjectDependenciesResolver.java:182)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:226)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:138)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:369)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:327)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:213)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:175)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:76)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:163)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:160)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:73)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:53)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:118)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:260)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:172)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:100)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.execute (MavenCli.java:821)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:270)
[2023-03-30T21:32:37.395Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-03-30T21:32:37.395Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-03-30T21:32:37.395Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-03-30T21:32:37.395Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-03-30T21:32:37.395Z] Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact org.jenkins-ci.main:jenkins-war:war:2.346.1 from/to do-proxy (https://repo.do.jenkins.io/public/): Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.ArtifactTransportListener.transferFailed (ArtifactTransportListener.java:52)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:388)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:73)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-03-30T21:32:37.395Z]     at java.lang.Thread.run (Thread.java:829)
[2023-03-30T21:32:37.395Z] Caused by: org.apache.http.ConnectionClosedException: Premature end of Content-Length delimited message body (expected: 91,193,351; received: 45,782,202)
[2023-03-30T21:32:37.395Z]     at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:178)
[2023-03-30T21:32:37.395Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:135)
[2023-03-30T21:32:37.395Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:148)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.copy (AbstractTransporter.java:193)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.utilGet (AbstractTransporter.java:106)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter.access$200 (HttpTransporter.java:95)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter$EntityGetter.handle (HttpTransporter.java:626)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter.execute (HttpTransporter.java:371)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.transport.http.HttpTransporter.implGet (HttpTransporter.java:298)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.get (AbstractTransporter.java:72)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask (BasicRepositoryConnector.java:466)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:383)
[2023-03-30T21:32:37.395Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:73)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-03-30T21:32:37.395Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-03-30T21:32:37.395Z]     at java.lang.Thread.run (Thread.java:829)
[2023-03-30T21:32:37.395Z] [ERROR] 
[2023-03-30T21:32:37.395Z] [ERROR] Re-run Maven using the -X switch to enable full debug logging.
[2023-03-30T21:32:37.395Z] [ERROR] 
[2023-03-30T21:32:37.395Z] [ERROR] For more information about the errors and possible solutions, please read the following articles:
[2023-03-30T21:32:37.395Z] [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[2023-03-30T21:32:37.395Z] org.jenkins.tools.test.exception.PomExecutionException: mvn -B -V -e -ntp -s /home/jenkins/agent/workspace/Tools_bom_master@tmp/config13736734992442172432tmp -Dmaven.javadoc.skip=true clean process-test-classes in /home/jenkins/agent/workspace/Tools_bom_master/pct-work/script-security failed with exit status 1
[2023-03-30T21:32:37.395Z]  at org.jenkins.tools.test.maven.ExternalMavenRunner.run(ExternalMavenRunner.java:106)
[2023-03-30T21:32:37.395Z]  at org.jenkins.tools.test.PluginCompatTester.testPluginAgainst(PluginCompatTester.java:381)
[2023-03-30T21:32:37.395Z]  at org.jenkins.tools.test.PluginCompatTester.testPlugins(PluginCompatTester.java:182)
[2023-03-30T21:32:37.395Z]  at org.jenkins.tools.test.PluginCompatTesterCli.call(PluginCompatTesterCli.java:198)
[2023-03-30T21:32:37.395Z]  at org.jenkins.tools.test.PluginCompatTesterCli.call(PluginCompatTesterCli.java:43)
[2023-03-30T21:32:37.395Z]  at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
[2023-03-30T21:32:37.395Z]  at picocli.CommandLine.access$1500(CommandLine.java:148)
[2023-03-30T21:32:37.396Z]  at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
[2023-03-30T21:32:37.396Z]  at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
[2023-03-30T21:32:37.396Z]  at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
[2023-03-30T21:32:37.396Z]  at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
[2023-03-30T21:32:37.396Z]  at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
[2023-03-30T21:32:37.396Z]  at picocli.CommandLine.execute(CommandLine.java:2170)
[2023-03-30T21:32:37.396Z]  at org.jenkins.tools.test.PluginCompatTesterCli.main(PluginCompatTesterCli.java:203)
script returned exit code 1

Reproduction steps

Run a lot of jenkinsci/bom builds without the skip-artifact-caching-proxy label and you are bound to see it within a few attempts.

dduportal commented 1 year ago

Quick report of the actions performed Friday 30 of March by the Jenkins Infra team:

=> there is no "easy and obvious" cause

I heard @lemeurherve mentioning the builds were back to normal despite https://github.com/jenkinsci/bom/pull/1921 but I'm not sure: I'll let him report here monday (as the current main build of the BOM seems to succeed: we can wait for the week-end to pass).

Next steps:

basil commented 1 year ago

there is no "easy and obvious" cause

Did we consider the possibility of file descriptor exhaustion? Not sure what RLIMIT_NOFILE is on the nginx(8) process, but worker_connections (which is bounded by RLIMIT_NOFILE) is set to 1024. Maven's aether.connector.basic.threads defaults to 5, meaning that a proxy with 1024 workers could handle ~200 simultaneous Maven builds. That seems to be in the same order of magnitude as the number of concurrent PCT jobs we run. See https://nginx.org/en/docs/http/ngx_http_stub_status_module.html for how to get this information from nginx(8), or simply run ls -1 /proc/${PID}/fd | wc -l against the nginx(8) worker (not primary!) process during a busy BOM build for a crude approximation. If file descriptors are exhausted you could increase RLIMIT_NOFILE and/or worker_connections, or apply backpressure by passing e.g. -Daether.connector.basic.threads=2 to Maven to halve the number of concurrent requests.

I heard @lemeurherve mentioning the builds were back to normal despite jenkinsci/bom#1921 but I'm not sure: I'll let him report here monday (as the current main build of the BOM seems to succeed: we can wait for the week-end to pass).

I did not see any report today, but I am assuming the reasoning is that the Maven upgrade restored retries and therefore stability. That is possible; it could be confirmed by reverting jenkinsci/bom#1921 and seeing if the problem returns. I have never seen a failure with the old version of Maven and repo.jenkins-ci.org (Artifactory) though, only with the artifact caching proxy. In any case, the Maven retry behavior is simply a workaround on top of the real problem: it should not be necessary to retry downloads in the first place. They should suceed on the first try. To phrase this another way, the artifact caching proxy is a server built and hosted by us entirely for our own needs (and not accessible to anyone else), so we ought to be able to engineer it to meet our needs perfectly on the first try. This can be achieved using suitable hardware & networking resources combined with OS and userland configuration, either scaling horizontally (e.g., multiple proxies with a load balancer) or vertically (e.g., a single proxy with sufficient resources to meet demand).

Closing as the "unreliable" behavior (which is BOM-only) is tracked in #3481

A theory of file descriptor exhaustion is consistent with the pathology being only observable in BOM, as BOM is a pretty good stress test for the web server.

dduportal commented 1 year ago

I'm not sure about the RLIMIT_NOFILE : the worker_connections is set to 1024, that is true, but each ACP instance has 4 workers so 4096, and we have 2 ACP replicas per "cloud" with RR load balanced requests. Also, Nginx logs this error when it happens (faced it before many times) so I'm surprised that we did not see any error about this.

However, I realize we did NOT check for the ingress (which happen to be Nginx) configuration that might be the culprit (we do not hit ACP directly).

Anyway that could be worth it to:

dduportal commented 1 year ago

Also, there are fundamentals questions at stake:

dduportal commented 1 year ago

Please note that with https://github.com/jenkins-infra/helpdesk/issues/3487, investing too much time in DigitalOcean might be a waste as we are not sure if we'll get more credits.

dduportal commented 1 year ago

Metrics on the ACP side confirms that the 30th of March, the cumulated amount of active connections on the whole ACP for DigitalOcean never crossed the threshold of 400 simultaneously.

However, it seems there is a lot of connections "waiting" which goes into the direction of https://github.com/jenkins-infra/helm-charts/pull/471

Capture d’écran 2023-04-04 à 09 17 39
timja commented 1 year ago

Is it worth trying an artifactory instance running on each cloud provider as a proxy instead of nginx?

I assume nginx was used as that's what was used previously?

dduportal commented 1 year ago

Is it worth trying an artifactory instance running on each cloud provider as a proxy instead of nginx?

I assume nginx was used as that's what was used previously?

Yes, but not only:

I'm really bothered by this issue: there is nothing pointing finger at ACP specifically (otherwise the issue would have been seen in AWS as well since quite some time) so it really looks a weird issue.

timja commented 1 year ago

The maintenance of Artifactory is a lot of time to spend on. We can't compare to repo.jenkins-ci.org which has engineering team behind and a decade worth of tuning (and a cloud bill through the roof...).

We've ran it in a pod on Kubernetes for about 4 years and can count on one hand the number of times we've had to touch it. What maintenance are you thinking of?

dduportal commented 1 year ago

The maintenance of Artifactory is a lot of time to spend on. We can't compare to repo.jenkins-ci.org which has engineering team behind and a decade worth of tuning (and a cloud bill through the roof...).

We've ran it in a pod on Kubernetes for about 4 years and can count on one hand the number of times we've had to touch it. What maintenance are you thinking of?

these are the maintenance I'm thinking of.

(edit)

basil commented 1 year ago

I tried enabling the artifact caching proxy in ATH and got the same error, even with Maven 3.9.1. So Maven 3.9.1 is not the solution.

[2023-04-05T01:40:59.059Z] + mvn -B -V -e -ntp -s /home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/config4855491279108139757tmp test -Dmaven.repo.local=/home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/m2repo -Dmaven.test.failure.ignore=true -DforkCount=1
[2023-04-05T01:40:59.314Z] Apache Maven 3.9.1 (2e178502fcdbffc201671fb2537d0cb4b4cc58f8)
[2023-04-05T01:40:59.314Z] Maven home: /opt/maven
[2023-04-05T01:40:59.314Z] Java version: 11.0.18, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
[2023-04-05T01:40:59.314Z] Default locale: en_US, platform encoding: ANSI_X3.4-1968
[2023-04-05T01:40:59.314Z] OS name: "linux", version: "5.15.0-1035-azure", arch: "amd64", family: "unix"
[2023-04-05T01:41:01.871Z] [ERROR] Error executing Maven.
[2023-04-05T01:41:01.871Z] org.apache.maven.cli.internal.ExtensionResolutionException: Extension io.jenkins.tools.incrementals:git-changelist-maven-extension:1.6 or one of its dependencies could not be resolved: The following artifacts could not be resolved: org.apache.httpcomponents:httpclient:jar:4.3.6 (absent): Could not transfer artifact org.apache.httpcomponents:httpclient:jar:4.3.6 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:182)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T01:41:01.871Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T01:41:01.871Z] Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: The following artifacts could not be resolved: org.apache.httpcomponents:httpclient:jar:4.3.6 (absent): Could not transfer artifact org.apache.httpcomponents:httpclient:jar:4.3.6 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:456)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts (DefaultArtifactResolver.java:261)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:352)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveInternal (DefaultPluginDependenciesResolver.java:212)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveCoreExtension (DefaultPluginDependenciesResolver.java:148)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:174)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T01:41:01.871Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T01:41:01.871Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T01:41:01.871Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T01:41:01.871Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T01:41:01.871Z] Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact org.apache.httpcomponents:httpclient:jar:4.3.6 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.connector.basic.ArtifactTransportListener.transferFailed (ArtifactTransportListener.java:44)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:417)
[2023-04-05T01:41:01.871Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T01:41:01.871Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T01:41:01.871Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T01:41:01.871Z]     at java.lang.Thread.run (Thread.java:829)
[2023-04-05T01:41:01.871Z] Caused by: java.net.SocketException: Connection reset
[2023-04-05T01:41:01.871Z]     at java.net.SocketInputStream.read (SocketInputStream.java:186)
[2023-04-05T01:41:01.871Z]     at java.net.SocketInputStream.read (SocketInputStream.java:140)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.read (SSLSocketInputRecord.java:484)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.readFully (SSLSocketInputRecord.java:467)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord (SSLSocketInputRecord.java:243)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketInputRecord.decode (SSLSocketInputRecord.java:181)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLTransport.decode (SSLTransport.java:111)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketImpl.decode (SSLSocketImpl.java:1510)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketImpl.readApplicationRecord (SSLSocketImpl.java:1477)
[2023-04-05T01:41:01.871Z]     at sun.security.ssl.SSLSocketImpl$AppInputStream.read (SSLSocketImpl.java:1066)
[2023-04-05T01:41:01.871Z]     at org.apache.http.impl.io.SessionInputBufferImpl.streamRead (SessionInputBufferImpl.java:137)
[2023-04-05T01:41:01.872Z]     at org.apache.http.impl.io.SessionInputBufferImpl.read (SessionInputBufferImpl.java:197)
[2023-04-05T01:41:01.872Z]     at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:176)
[2023-04-05T01:41:01.872Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:135)
[2023-04-05T01:41:01.872Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:148)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.copy (AbstractTransporter.java:163)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.utilGet (AbstractTransporter.java:94)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter.access$200 (HttpTransporter.java:96)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter$EntityGetter.handle (HttpTransporter.java:565)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter.execute (HttpTransporter.java:374)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.transport.http.HttpTransporter.implGet (HttpTransporter.java:319)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.get (AbstractTransporter.java:63)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask (BasicRepositoryConnector.java:482)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:414)
[2023-04-05T01:41:01.872Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T01:41:01.872Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T01:41:01.872Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T01:41:01.872Z]     at java.lang.Thread.run (Thread.java:829)
basil commented 1 year ago

Seeing the same error in ATH even in Azure with Maven 3.9.1 in https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/PR-1097/10/ -- giving up on enabling artifact caching proxy for ATH.

[2023-04-05T02:54:23.756Z] + eval export BROWSER_DISPLAY=:42 export DISPLAY=:42
[2023-04-05T02:54:23.756Z] ++ export BROWSER_DISPLAY=:42 export DISPLAY=:42
[2023-04-05T02:54:23.756Z] ++ BROWSER_DISPLAY=:42
[2023-04-05T02:54:23.756Z] ++ DISPLAY=:42
[2023-04-05T02:54:23.756Z] + java -version
[2023-04-05T02:54:23.756Z] openjdk version "11.0.18" 2023-01-17
[2023-04-05T02:54:23.756Z] OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu122.04)
[2023-04-05T02:54:23.756Z] OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu122.04, mixed mode, sharing)
[2023-04-05T02:54:23.756Z] + run.sh firefox lts -Dmaven.repo.local=/home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/m2repo -Dmaven.test.failure.ignore=true -DforkCount=1
[2023-04-05T02:54:23.756Z] + BROWSER=firefox
[2023-04-05T02:54:23.756Z] + JENKINS_WAR=target/jenkins-war.war
[2023-04-05T02:54:23.756Z] + mvn -B -V -e -ntp -s /home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/config11205400074458065414tmp test -Plts -Dmaven.repo.local=/home/jenkins/workspace/_acceptance-test-harness_PR-1097@tmp/m2repo -Dmaven.test.failure.ignore=true -DforkCount=1
[2023-04-05T02:54:24.022Z] Apache Maven 3.9.1 (2e178502fcdbffc201671fb2537d0cb4b4cc58f8)
[2023-04-05T02:54:24.022Z] Maven home: /opt/maven
[2023-04-05T02:54:24.022Z] Java version: 11.0.18, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
[2023-04-05T02:54:24.022Z] Default locale: en_US, platform encoding: ANSI_X3.4-1968
[2023-04-05T02:54:24.022Z] OS name: "linux", version: "5.15.0-1035-azure", arch: "amd64", family: "unix"
[2023-04-05T02:54:26.050Z] [ERROR] Error executing Maven.
[2023-04-05T02:54:26.050Z] org.apache.maven.cli.internal.ExtensionResolutionException: Extension io.jenkins.tools.incrementals:git-changelist-maven-extension:1.6 or one of its dependencies could not be resolved: The following artifacts could not be resolved: com.google.inject:guice:jar:5.1.0 (absent): Could not transfer artifact com.google.inject:guice:jar:5.1.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:182)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T02:54:26.050Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T02:54:26.050Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T02:54:26.050Z] Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: The following artifacts could not be resolved: com.google.inject:guice:jar:5.1.0 (absent): Could not transfer artifact com.google.inject:guice:jar:5.1.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T02:54:26.050Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:456)
[2023-04-05T02:54:26.050Z]     at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolveArtifacts (DefaultArtifactResolver.java:261)
[2023-04-05T02:54:26.050Z]     at org.eclipse.aether.internal.impl.DefaultRepositorySystem.resolveDependencies (DefaultRepositorySystem.java:352)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveInternal (DefaultPluginDependenciesResolver.java:212)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.plugin.internal.DefaultPluginDependenciesResolver.resolveCoreExtension (DefaultPluginDependenciesResolver.java:148)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveExtension (BootstrapCoreExtensionManager.java:174)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.resolveCoreExtensions (BootstrapCoreExtensionManager.java:115)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.internal.BootstrapCoreExtensionManager.loadCoreExtensions (BootstrapCoreExtensionManager.java:99)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.loadCoreExtensions (MavenCli.java:677)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.container (MavenCli.java:548)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:266)
[2023-04-05T02:54:26.050Z]     at org.apache.maven.cli.MavenCli.main (MavenCli.java:195)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
[2023-04-05T02:54:26.050Z]     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
[2023-04-05T02:54:26.050Z]     at java.lang.reflect.Method.invoke (Method.java:566)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
[2023-04-05T02:54:26.051Z]     at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[2023-04-05T02:54:26.051Z] Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Could not transfer artifact com.google.inject:guice:jar:5.1.0 from/to azure-proxy (https://repo.azure.jenkins.io/public/): Connection reset
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.ArtifactTransportListener.transferFailed (ArtifactTransportListener.java:44)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:417)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T02:54:26.051Z]     at java.lang.Thread.run (Thread.java:829)
[2023-04-05T02:54:26.051Z] Caused by: java.net.SocketException: Connection reset
[2023-04-05T02:54:26.051Z]     at java.net.SocketInputStream.read (SocketInputStream.java:186)
[2023-04-05T02:54:26.051Z]     at java.net.SocketInputStream.read (SocketInputStream.java:140)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.read (SSLSocketInputRecord.java:484)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.readFully (SSLSocketInputRecord.java:467)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord (SSLSocketInputRecord.java:243)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketInputRecord.decode (SSLSocketInputRecord.java:181)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLTransport.decode (SSLTransport.java:111)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketImpl.decode (SSLSocketImpl.java:1510)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketImpl.readApplicationRecord (SSLSocketImpl.java:1477)
[2023-04-05T02:54:26.051Z]     at sun.security.ssl.SSLSocketImpl$AppInputStream.read (SSLSocketImpl.java:1066)
[2023-04-05T02:54:26.051Z]     at org.apache.http.impl.io.SessionInputBufferImpl.streamRead (SessionInputBufferImpl.java:137)
[2023-04-05T02:54:26.051Z]     at org.apache.http.impl.io.SessionInputBufferImpl.read (SessionInputBufferImpl.java:197)
[2023-04-05T02:54:26.051Z]     at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:176)
[2023-04-05T02:54:26.051Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:135)
[2023-04-05T02:54:26.051Z]     at org.apache.http.conn.EofSensorInputStream.read (EofSensorInputStream.java:148)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.copy (AbstractTransporter.java:163)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.utilGet (AbstractTransporter.java:94)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter.access$200 (HttpTransporter.java:96)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter$EntityGetter.handle (HttpTransporter.java:565)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter.execute (HttpTransporter.java:374)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.transport.http.HttpTransporter.implGet (HttpTransporter.java:319)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.spi.connector.transport.AbstractTransporter.get (AbstractTransporter.java:63)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$GetTaskRunner.runTask (BasicRepositoryConnector.java:482)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.connector.basic.BasicRepositoryConnector$TaskRunner.run (BasicRepositoryConnector.java:414)
[2023-04-05T02:54:26.051Z]     at org.eclipse.aether.util.concurrency.RunnableErrorForwarder.lambda$wrap$0 (RunnableErrorForwarder.java:66)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
[2023-04-05T02:54:26.051Z]     at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
[2023-04-05T02:54:26.051Z]     at java.lang.Thread.run (Thread.java:829)
script returned exit code 1
basil commented 1 year ago

Closing as the "unreliable" behavior (which is BOM-only) is tracked in https://github.com/jenkins-infra/helpdesk/issues/3481

I am not sure why scare quotes were used. When trying to enable the artifact caching proxy in ATH, I got failures on both DigitalOcean and Azure, but not when I removed the artifact caching proxy (using Maven 3.9.1 in all three cases). The artifact caching proxy is clearly unreliable, and this is clearly not BOM-only.

dduportal commented 1 year ago

About the ACP Azure problem encountered with ATH:

Capture d’écran 2023-04-05 à 08 57 29 Capture d’écran 2023-04-05 à 08 58 22 Capture d’écran 2023-04-05 à 08 59 47

Nothing obvious at first sight.

Currently checking the metrics of the VMs used as agent by the ATH to see if there has been an error on these machines

dduportal commented 1 year ago

Extracted the list of agent names from https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/PR-1097/10/ with the command grep -iIr 'running on ubuntu' consoleText | cut -d' ' -f4 | sort -u , adapted it to the following Datadog filter: host:ubuntu-20-ba8a00 OR host:ubuntu-20-highmem0d07c0 OR host:ubuntu-20-highmem0d07c1 OR host:ubuntu-20-highmem165be1 OR host:ubuntu-20-highmem1b0380 OR host:ubuntu-20-highmem2b8f80 OR host:ubuntu-20-highmem2b8f81 OR host:ubuntu-20-highmem34e3e1 OR host:ubuntu-20-highmem4a1770 OR host:ubuntu-20-highmem5a9190 OR host:ubuntu-20-highmem5a9191 OR host:ubuntu-20-highmem5a9192 OR host:ubuntu-20-highmem63e5d0 OR host:ubuntu-20-highmem826dc0 OR host:ubuntu-20-highmem97a160 OR host:ubuntu-20-highmembe07f0 OR host:ubuntu-20-highmembf7da0 OR host:ubuntu-20-highmemcff7d0 OR host:ubuntu-20-highmemdc8fe0 OR host:ubuntu-20-highmemfc8d80 OR host:

=> it appears that only one of these machines reported metrics during the past 24h, and only for a few minutes:

Capture d’écran 2023-04-05 à 09 11 08

while there is a build of the ATH (https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/master/722/) currently being build and its VM agents are reporting metrics:

Capture d’écran 2023-04-05 à 09 14 06

Azure status does not report any incident though: https://azure.status.microsoft/en-us/status/ and https://azure.status.microsoft/en-us/status/history/.

I'm not sure where to go from here: again, there is nothing pointing at the ACP particularly, but it is hard to diagnose so any idea to help is welcome.

Feels like a network issue, which might be caused by the IP overlap in the legacy networks (ref. https://github.com/jenkins-infra/helpdesk/issues/3351). Proposal: setting the network of the ci.jenkins.io closer to the ACP's to decrease the latency and in a not-overlapped network

dduportal commented 1 year ago

Closing as the "unreliable" behavior (which is BOM-only) is tracked in #3481

I am not sure why scare quotes were used. When trying to enable the artifact caching proxy in ATH, I got failures on both DigitalOcean and Azure, but not when I removed the artifact caching proxy (using Maven 3.9.1 in all three cases). The artifact caching proxy is clearly unreliable, and this is clearly not BOM-only.

I can edit my comment if the quotes bother you. There is nothing to understand from this quotes. It was written by reflex by someone who is not a native English speaker.

Yes, you have issues when enabling the ACP. Now that you reported the issue with the ATH we are looking on it.

basil commented 1 year ago

Scare quotes are used to elicit doubt, so I wanted to be clear that there is a problem. I am happy to step back and let the infrastructure team investigate. I do not have the context and access to dig deeper anyway, and I do not debug problems like this with dashboards and log files — I am more old-school and debug problems like this by doing experiments and running system-level commands to observe the effects. Nevertheless I have never met a debugging problem that I could not eventually solve.

lemeurherve commented 1 year ago

FWIW since the merge of your PR introducing ACP on ATH, there has been 3 builds on master and 3 builds on PRs without any ACP error.

basil commented 1 year ago

FWIW since your PR introducing ACP on ATH, there has been 3 builds on master and 3 builds on PRs without any ACP error.

That is expected, because I removed the lines introducing the artifact caching proxy from that PR:

giving up on enabling artifact caching proxy for ATH.

I would have kept them, if it were not for the proxy being unreliable as described in this ticket.

lemeurherve commented 1 year ago

That is expected, because I removed the lines introducing the artifact caching proxy from that PR

The infra.runMaven function is using artifact caching proxy by default.

This is the function now used in the Jenkinsfile on the master branch since the merge of your PR: https://github.com/jenkinsci/acceptance-test-harness/blob/master/Jenkinsfile#L47

We can see the use of the ACP in the recent master build logs, ex: https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/master/725/consoleText

[2023-04-05T10:37:00.754Z] INFO: using artifact caching proxy from 'do' provider.

[2023-04-05T10:37:05.571Z] + mvn --batch-mode --show-version --errors --no-transfer-progress -s /home/jenkins/agent/workspace/e_acceptance-test-harness_master@tmp/config14181552121585774481tmp -Dset.changelist -DskipTests -U clean install

basil commented 1 year ago

You seem to be missing the fact that the primary usage of Maven in that job is within run.sh. When I tried to make that use the proxy yesterday, I saw failures on both Azure and DigitalOcean as described in this ticket.

dduportal commented 1 year ago

Starting to work on the network of the ci.jenkins.io's VM agents. Goal: ensure that the subnet where the machines are spawned is close to the ACP and does not suffer from any kind of routing overlap to avoid abrupt cuts in the intermediate networking layers.

dduportal commented 1 year ago

Got an error with the AWS ACP, in the context of a BOM build on its master branch, involving a temporary failure in DNS resolution: https://ci.jenkins.io/blue/organizations/jenkins/Tools%2Fbom/detail/master/1666/pipeline#step-5846-log-248

It's not an issue with ACP itself, but worth mentioning.

dduportal commented 1 year ago

Hello 👋 It's worth retrying ATH artifact caching proxy as https://github.com/jenkins-infra/helpdesk/issues/3535#issuecomment-1607874573 was implemented : that should remove a lot of the network errors seen in the past months

lemeurherve commented 1 year ago

I've opened a PR at https://github.com/jenkinsci/acceptance-test-harness/pull/1279

lemeurherve commented 1 year ago

Since the PR merge two weeks ago I didn't notice particular ATH failures directly related to the artifact caching proxy.

It seems to me we can close this issue, WDYT @basil ?

lemeurherve commented 1 year ago

No objection, closing this issue.

basil commented 1 year ago

Sorry, I missed this notification last week. In any case I have not seen any ACP failures. Thank you very much!