graalvm / native-build-tools

Native-image plugins for various build tools
https://graalvm.github.io/native-build-tools/
Other
366 stars 57 forks source link

Add some resiliency to the collection of reachability metadata #592

Closed wilkinsona closed 4 months ago

wilkinsona commented 5 months ago

Is your feature request related to a problem? Please describe.

We build 10s of native images on a nightly basis and roughly one night in three (at the time of writing) we'll see a failure due to a connection timeout when downloading reachability metadata. I would like our builds to be more robust.

Describe the solution you'd like Builds could be made more robust by configuring a longer connection timeout or by retrying failures that should be transient.

Describe alternatives you've considered I considered some configuration on the Gradle side to retry the task that's failing. Some research suggests that this isn't something that Gradle offers, at least not without resorting to hacks. It also wouldn't benefit other users of GraalVMReachabilityMetadataService.

Additional context

Connection timeout stack trace ``` Caused by: java.net.ConnectException: Connection timed out at org.graalvm.buildtools.gradle.internal.GraalVMReachabilityMetadataService.newRepository(GraalVMReachabilityMetadataService.java:128) at org.graalvm.buildtools.gradle.internal.GraalVMReachabilityMetadataService.(GraalVMReachabilityMetadataService.java:97) at org.graalvm.buildtools.gradle.internal.GraalVMReachabilityMetadataService$Inject.(Unknown Source) at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77) at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at org.gradle.internal.instantiation.generator.AsmBackedClassGenerator$InvokeConstructorStrategy.newInstance(AsmBackedClassGenerator.java:1913) at org.gradle.internal.instantiation.generator.AbstractClassGenerator$GeneratedClassImpl$GeneratedConstructorImpl.newInstance(AbstractClassGenerator.java:506) at org.gradle.internal.instantiation.generator.DependencyInjectingInstantiator.doCreate(DependencyInjectingInstantiator.java:64) at org.gradle.internal.instantiation.generator.DependencyInjectingInstantiator.newInstance(DependencyInjectingInstantiator.java:55) at org.gradle.api.services.internal.BuildServiceProvider.instantiate(BuildServiceProvider.java:152) at org.gradle.api.services.internal.BuildServiceProvider.instantiate(BuildServiceProvider.java:141) at org.gradle.api.services.internal.BuildServiceProvider.getInstance(BuildServiceProvider.java:128) at org.gradle.api.services.internal.BuildServiceProvider.calculateOwnValue(BuildServiceProvider.java:121) at org.gradle.api.internal.provider.AbstractMinimalProvider.calculateValue(AbstractMinimalProvider.java:107) at org.gradle.api.internal.provider.DefaultProperty.calculateValueFrom(DefaultProperty.java:128) at org.gradle.api.internal.provider.DefaultProperty.calculateValueFrom(DefaultProperty.java:26) at org.gradle.api.internal.provider.AbstractProperty.doCalculateValue(AbstractProperty.java:133) at org.gradle.api.internal.provider.AbstractProperty.calculateOwnValue(AbstractProperty.java:127) at org.gradle.api.internal.provider.AbstractMinimalProvider.calculateOwnPresentValue(AbstractMinimalProvider.java:72) at org.gradle.api.internal.provider.AbstractMinimalProvider.get(AbstractMinimalProvider.java:92) at org.graalvm.buildtools.gradle.tasks.CollectReachabilityMetadata.copyReachabilityMetadata(CollectReachabilityMetadata.java:132) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:125) at org.gradle.api.internal.project.taskfactory.StandardTaskAction.doExecute(StandardTaskAction.java:58) at org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:51) at org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:29) at org.gradle.api.internal.tasks.execution.TaskExecution$3.run(TaskExecution.java:236) at org.gradle.internal.operations.DefaultBuildOperationRunner$1.execute(DefaultBuildOperationRunner.java:29) at org.gradle.internal.operations.DefaultBuildOperationRunner$1.execute(DefaultBuildOperationRunner.java:26) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:66) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:157) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.run(DefaultBuildOperationRunner.java:47) at org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:68) at org.gradle.api.internal.tasks.execution.TaskExecution.executeAction(TaskExecution.java:221) at org.gradle.api.internal.tasks.execution.TaskExecution.executeActions(TaskExecution.java:204) at org.gradle.api.internal.tasks.execution.TaskExecution.executeWithPreviousOutputFiles(TaskExecution.java:187) at org.gradle.api.internal.tasks.execution.TaskExecution.execute(TaskExecution.java:165) at org.gradle.internal.execution.steps.ExecuteStep.executeInternal(ExecuteStep.java:89) at org.gradle.internal.execution.steps.ExecuteStep.access$000(ExecuteStep.java:40) at org.gradle.internal.execution.steps.ExecuteStep$1.call(ExecuteStep.java:53) at org.gradle.internal.execution.steps.ExecuteStep$1.call(ExecuteStep.java:50) at org.gradle.internal.operations.DefaultBuildOperationRunner$CallableBuildOperationWorker.execute(DefaultBuildOperationRunner.java:204) at org.gradle.internal.operations.DefaultBuildOperationRunner$CallableBuildOperationWorker.execute(DefaultBuildOperationRunner.java:199) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:66) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:157) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.call(DefaultBuildOperationRunner.java:53) at org.gradle.internal.operations.DefaultBuildOperationExecutor.call(DefaultBuildOperationExecutor.java:73) at org.gradle.internal.execution.steps.ExecuteStep.execute(ExecuteStep.java:50) at org.gradle.internal.execution.steps.ExecuteStep.execute(ExecuteStep.java:40) at org.gradle.internal.execution.steps.RemovePreviousOutputsStep.execute(RemovePreviousOutputsStep.java:68) at org.gradle.internal.execution.steps.RemovePreviousOutputsStep.execute(RemovePreviousOutputsStep.java:38) at org.gradle.internal.execution.steps.CancelExecutionStep.execute(CancelExecutionStep.java:41) at org.gradle.internal.execution.steps.TimeoutStep.executeWithoutTimeout(TimeoutStep.java:74) at org.gradle.internal.execution.steps.TimeoutStep.execute(TimeoutStep.java:55) at org.gradle.internal.execution.steps.CreateOutputsStep.execute(CreateOutputsStep.java:51) at org.gradle.internal.execution.steps.CreateOutputsStep.execute(CreateOutputsStep.java:29) at org.gradle.internal.execution.steps.CaptureStateAfterExecutionStep.executeDelegateBroadcastingChanges(CaptureStateAfterExecutionStep.java:124) at org.gradle.internal.execution.steps.CaptureStateAfterExecutionStep.execute(CaptureStateAfterExecutionStep.java:80) at org.gradle.internal.execution.steps.CaptureStateAfterExecutionStep.execute(CaptureStateAfterExecutionStep.java:58) at org.gradle.internal.execution.steps.ResolveInputChangesStep.execute(ResolveInputChangesStep.java:48) at org.gradle.internal.execution.steps.ResolveInputChangesStep.execute(ResolveInputChangesStep.java:36) at org.gradle.internal.execution.steps.BuildCacheStep.executeWithoutCache(BuildCacheStep.java:181) at org.gradle.internal.execution.steps.BuildCacheStep.lambda$execute$1(BuildCacheStep.java:71) at org.gradle.internal.Either$Right.fold(Either.java:175) at org.gradle.internal.execution.caching.CachingState.fold(CachingState.java:59) at org.gradle.internal.execution.steps.BuildCacheStep.execute(BuildCacheStep.java:69) at org.gradle.internal.execution.steps.BuildCacheStep.execute(BuildCacheStep.java:47) at org.gradle.internal.execution.steps.StoreExecutionStateStep.execute(StoreExecutionStateStep.java:36) at org.gradle.internal.execution.steps.StoreExecutionStateStep.execute(StoreExecutionStateStep.java:25) at org.gradle.internal.execution.steps.RecordOutputsStep.execute(RecordOutputsStep.java:36) at org.gradle.internal.execution.steps.RecordOutputsStep.execute(RecordOutputsStep.java:22) at org.gradle.internal.execution.steps.SkipUpToDateStep.executeBecause(SkipUpToDateStep.java:110) at org.gradle.internal.execution.steps.SkipUpToDateStep.lambda$execute$2(SkipUpToDateStep.java:56) at org.gradle.internal.execution.steps.SkipUpToDateStep.execute(SkipUpToDateStep.java:56) at org.gradle.internal.execution.steps.SkipUpToDateStep.execute(SkipUpToDateStep.java:38) at org.gradle.internal.execution.steps.ResolveChangesStep.execute(ResolveChangesStep.java:73) at org.gradle.internal.execution.steps.ResolveChangesStep.execute(ResolveChangesStep.java:44) at org.gradle.internal.execution.steps.legacy.MarkSnapshottingInputsFinishedStep.execute(MarkSnapshottingInputsFinishedStep.java:37) at org.gradle.internal.execution.steps.legacy.MarkSnapshottingInputsFinishedStep.execute(MarkSnapshottingInputsFinishedStep.java:27) at org.gradle.internal.execution.steps.ResolveCachingStateStep.execute(ResolveCachingStateStep.java:89) at org.gradle.internal.execution.steps.ResolveCachingStateStep.execute(ResolveCachingStateStep.java:50) at org.gradle.internal.execution.steps.ValidateStep.execute(ValidateStep.java:102) at org.gradle.internal.execution.steps.ValidateStep.execute(ValidateStep.java:57) at org.gradle.internal.execution.steps.CaptureStateBeforeExecutionStep.execute(CaptureStateBeforeExecutionStep.java:76) at org.gradle.internal.execution.steps.CaptureStateBeforeExecutionStep.execute(CaptureStateBeforeExecutionStep.java:50) at org.gradle.internal.execution.steps.SkipEmptyWorkStep.executeWithNoEmptySources(SkipEmptyWorkStep.java:254) at org.gradle.internal.execution.steps.SkipEmptyWorkStep.execute(SkipEmptyWorkStep.java:91) at org.gradle.internal.execution.steps.SkipEmptyWorkStep.execute(SkipEmptyWorkStep.java:56) at org.gradle.internal.execution.steps.RemoveUntrackedExecutionStateStep.execute(RemoveUntrackedExecutionStateStep.java:32) at org.gradle.internal.execution.steps.RemoveUntrackedExecutionStateStep.execute(RemoveUntrackedExecutionStateStep.java:21) at org.gradle.internal.execution.steps.legacy.MarkSnapshottingInputsStartedStep.execute(MarkSnapshottingInputsStartedStep.java:38) at org.gradle.internal.execution.steps.LoadPreviousExecutionStateStep.execute(LoadPreviousExecutionStateStep.java:43) at org.gradle.internal.execution.steps.LoadPreviousExecutionStateStep.execute(LoadPreviousExecutionStateStep.java:31) at org.gradle.internal.execution.steps.AssignWorkspaceStep.lambda$execute$0(AssignWorkspaceStep.java:40) at org.gradle.api.internal.tasks.execution.TaskExecution$4.withWorkspace(TaskExecution.java:281) at org.gradle.internal.execution.steps.AssignWorkspaceStep.execute(AssignWorkspaceStep.java:40) at org.gradle.internal.execution.steps.AssignWorkspaceStep.execute(AssignWorkspaceStep.java:30) at org.gradle.internal.execution.steps.IdentityCacheStep.execute(IdentityCacheStep.java:37) at org.gradle.internal.execution.steps.IdentityCacheStep.execute(IdentityCacheStep.java:27) at org.gradle.internal.execution.steps.IdentifyStep.execute(IdentifyStep.java:44) at org.gradle.internal.execution.steps.IdentifyStep.execute(IdentifyStep.java:33) at org.gradle.internal.execution.impl.DefaultExecutionEngine$1.execute(DefaultExecutionEngine.java:76) at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeIfValid(ExecuteActionsTaskExecuter.java:139) at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.execute(ExecuteActionsTaskExecuter.java:128) at org.gradle.api.internal.tasks.execution.CleanupStaleOutputsExecuter.execute(CleanupStaleOutputsExecuter.java:77) at org.gradle.api.internal.tasks.execution.FinalizePropertiesTaskExecuter.execute(FinalizePropertiesTaskExecuter.java:46) at org.gradle.api.internal.tasks.execution.ResolveTaskExecutionModeExecuter.execute(ResolveTaskExecutionModeExecuter.java:51) at org.gradle.api.internal.tasks.execution.SkipTaskWithNoActionsExecuter.execute(SkipTaskWithNoActionsExecuter.java:57) at org.gradle.api.internal.tasks.execution.SkipOnlyIfTaskExecuter.execute(SkipOnlyIfTaskExecuter.java:57) at org.gradle.api.internal.tasks.execution.CatchExceptionTaskExecuter.execute(CatchExceptionTaskExecuter.java:36) at org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter$1.executeTask(EventFiringTaskExecuter.java:77) at org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter$1.call(EventFiringTaskExecuter.java:55) at org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter$1.call(EventFiringTaskExecuter.java:52) at org.gradle.internal.operations.DefaultBuildOperationRunner$CallableBuildOperationWorker.execute(DefaultBuildOperationRunner.java:204) at org.gradle.internal.operations.DefaultBuildOperationRunner$CallableBuildOperationWorker.execute(DefaultBuildOperationRunner.java:199) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:66) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:157) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.call(DefaultBuildOperationRunner.java:53) at org.gradle.internal.operations.DefaultBuildOperationExecutor.call(DefaultBuildOperationExecutor.java:73) at org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter.execute(EventFiringTaskExecuter.java:52) at org.gradle.execution.plan.LocalTaskNodeExecutor.execute(LocalTaskNodeExecutor.java:69) at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph$InvokeNodeExecutorsAction.execute(DefaultTaskExecutionGraph.java:322) at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph$InvokeNodeExecutorsAction.execute(DefaultTaskExecutionGraph.java:309) at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph$BuildOperationAwareExecutionAction.execute(DefaultTaskExecutionGraph.java:302) at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph$BuildOperationAwareExecutionAction.execute(DefaultTaskExecutionGraph.java:288) at org.gradle.execution.plan.DefaultPlanExecutor$ExecutorWorker.execute(DefaultPlanExecutor.java:462) at org.gradle.execution.plan.DefaultPlanExecutor$ExecutorWorker.run(DefaultPlanExecutor.java:379) at org.gradle.execution.plan.DefaultPlanExecutor.process(DefaultPlanExecutor.java:115) at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph.executeWithServices(DefaultTaskExecutionGraph.java:135) at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph.execute(DefaultTaskExecutionGraph.java:120) at org.gradle.execution.SelectedTaskExecutionAction.execute(SelectedTaskExecutionAction.java:35) at org.gradle.execution.DryRunBuildExecutionAction.execute(DryRunBuildExecutionAction.java:51) at org.gradle.execution.BuildOperationFiringBuildWorkerExecutor$ExecuteTasks.call(BuildOperationFiringBuildWorkerExecutor.java:54) at org.gradle.execution.BuildOperationFiringBuildWorkerExecutor$ExecuteTasks.call(BuildOperationFiringBuildWorkerExecutor.java:43) at org.gradle.internal.operations.DefaultBuildOperationRunner$CallableBuildOperationWorker.execute(DefaultBuildOperationRunner.java:204) at org.gradle.internal.operations.DefaultBuildOperationRunner$CallableBuildOperationWorker.execute(DefaultBuildOperationRunner.java:199) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:66) at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:157) at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:59) at org.gradle.internal.operations.DefaultBuildOperationRunner.call(DefaultBuildOperationRunner.java:53) at org.gradle.internal.operations.DefaultBuildOperationExecutor.call(DefaultBuildOperationExecutor.java:73) at org.gradle.execution.BuildOperationFiringBuildWorkerExecutor.execute(BuildOperationFiringBuildWorkerExecutor.java:40) at org.gradle.internal.build.DefaultBuildLifecycleController.lambda$executeTasks$6(DefaultBuildLifecycleController.java:167) at org.gradle.internal.model.StateTransitionController.doTransition(StateTransitionController.java:247) at org.gradle.internal.model.StateTransitionController.lambda$tryTransition$7(StateTransitionController.java:174) at org.gradle.internal.work.DefaultSynchronizer.withLock(DefaultSynchronizer.java:44) at org.gradle.internal.model.StateTransitionController.tryTransition(StateTransitionController.java:174) at org.gradle.internal.build.DefaultBuildLifecycleController.executeTasks(DefaultBuildLifecycleController.java:167) at org.gradle.internal.build.DefaultBuildWorkGraphController$DefaultBuildWorkGraph.runWork(DefaultBuildWorkGraphController.java:190) at org.gradle.internal.work.DefaultWorkerLeaseService.withLocks(DefaultWorkerLeaseService.java:249) at org.gradle.internal.work.DefaultWorkerLeaseService.runAsWorkerThread(DefaultWorkerLeaseService.java:109) at org.gradle.composite.internal.DefaultBuildController.doRun(DefaultBuildController.java:172) at org.gradle.composite.internal.DefaultBuildController.access$000(DefaultBuildController.java:47) at org.gradle.composite.internal.DefaultBuildController$BuildOpRunnable.run(DefaultBuildController.java:191) at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64) at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:49) ```
melix commented 5 months ago

Are you using a custom repository URL? The default should be to pick the release in Maven Central. I'm not sure that increasing the connect timeout would be helpful in general, but retries with exponential backoff may be.

wilkinsona commented 5 months ago

Thanks for taking a look, @melix. We're using the default repository.

melix commented 5 months ago

Right, it's quite curious, because in this case the repository should be resolved as a Maven artifact. I don't think we'd have the ability to retry in this case. Also it would be strange that this particular dependency fails but not the regular dependency resolution. Unless if you are using mirrors?

wilkinsona commented 5 months ago

No mirrors involved. The builds are running on GitHub Actions and use Maven Central for the vast majority of their dependencies, with a little bit of https://repo.spring.io mixed in when we're testing snapshots.

wilkinsona commented 5 months ago

Sorry, I may not have been entirely accurate in my earlier comment about the repository URL. Looking at the code in computeMetadataRepositoryUri, I think I've learned that a custom metadata version will result in a custom repository URL that points to github.com being used. We're setting the version to 0.3.7.

melix commented 5 months ago

Ah thanks, that makes more sense to me, then you're probably hitting rate limiting from GitHub APIs, we've seen this (in another context) for Micronaut builds too.

Something you might want to consider is actually to download the repository zip file and include it in your sources, then configure the extension to point to it. Not sure how much this is an option for you.

wilkinsona commented 5 months ago

AFAIK, GitHub doesn't rate-limit download links. When it does rate limit, from what I have seen it's done with a 403 response rather than by refusing a connection.

Experimenting with curl, https://github.com/oracle/graalvm-reachability-metadata/releases/download/0.3.7/graalvm-reachability-metadata-0.3.7.zip redirects to https://objects.githubusercontent.com which, judging by the query parameters such as ?X-Amz-Algorithm=AWS4-HMAC-SHA256, is an S3 bucket. I don't know for certain as it's not apparent from the failure, but this feels like a regular connection timeout when connecting to S3 for which retries (or a longer timeout) could help.

Not sure how much this is an option for you.

It's certainly an option. Thanks for the suggestion. That said, I think an improvement here would be better as it would have broader benefit. I am happy to contribute something if we can agree on the details (number of retries, back off approach, etc).

melix commented 5 months ago

Sure, we will implement retries in case we don't download from Central, my suggestion was a workaround in the meantime.