airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Docker - Can't find/pull source-image during Sync #40642

Open michaelsonnle opened 1 month ago

michaelsonnle commented 1 month ago

Platform Version

0.63.3

What step the error happened?

During the Sync

Relevant information

We are seeing the following error on v0.63.0 as well as on the current v0.63.3:

2024-06-30 16:16:45 platform > ----- START CHECK -----
2024-06-30 16:16:45 platform > 
2024-06-30 16:16:45 platform > Checking if airbyte/source-mysql:3.4.11 exists...
2024-06-30 16:16:45 platform > airbyte/source-mysql:3.4.11 not found locally. Attempting to pull the image...
2024-06-30 16:16:45 platform > Image does not exist.
2024-06-30 16:16:45 platform > Unexpected error while checking connection: 
io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/source-mysql:3.4.11

So far, this has happened twice, roughly 4 days apart, and affected all sources each time. We cannot reproduce the error, and we have already verified that the image is available on our server using the following command:

docker pull airbyte/source-mysql:3.4.11
3.4.11: Pulling from airbyte/source-mysql
Digest: sha256:bd47f78cf03f2c21b4fca125439d3aaaf957819d3c40a006dfea318fd964b95c
Status: Image is up to date for airbyte/source-mysql:3.4.11
docker.io/airbyte/source-mysql:3.4.11

It seems that the image is present locally on the server, but Airbyte cannot access it.

After restarting Airbyte, everything immediately works again as usual and Airbyte also finds all images.

What could be the reason for this?

Relevant log output

2024-06-30 16:12:15 platform > Retry State: RetryManager(completeFailureBackoffPolicy=BackoffPolicy(minInterval=PT10S, maxInterval=PT30M, base=3), partialFailureBackoffPolicy=null, successiveCompleteFailureLimit=5, totalCompleteFailureLimit=10, successivePartialFailureLimit=1000, totalPartialFailureLimit=20, successiveCompleteFailures=4, totalCompleteFailures=4, successivePartialFailures=0, totalPartialFailures=0)
2024-06-30 16:12:15 platform > Backing off for: 4 minutes 30 seconds.
2024-06-30 16:16:45 platform > Docker volume job log path: /tmp/workspace/22932/4/logs.log
2024-06-30 16:16:45 platform > Executing worker wrapper. Airbyte version: 0.63.3
2024-06-30 16:16:45 platform > Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2024-06-30 16:16:45 platform > Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2024-06-30 16:16:45 platform > Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2024-06-30 16:16:45 platform > Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2024-06-30 16:16:45 platform > 
2024-06-30 16:16:45 platform > ----- START CHECK -----
2024-06-30 16:16:45 platform > 
2024-06-30 16:16:45 platform > Checking if airbyte/source-mysql:3.4.11 exists...
2024-06-30 16:16:45 platform > airbyte/source-mysql:3.4.11 not found locally. Attempting to pull the image...
2024-06-30 16:16:45 platform > Image does not exist.
2024-06-30 16:16:45 platform > Unexpected error while checking connection: 
io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/source-mysql:3.4.11
    at io.airbyte.workers.process.DockerProcessFactory.create(DockerProcessFactory.java:117) ~[io.airbyte-airbyte-commons-worker-0.63.3.jar:?]
    at io.airbyte.workers.process.AirbyteIntegrationLauncher.check(AirbyteIntegrationLauncher.java:147) ~[io.airbyte-airbyte-commons-worker-0.63.3.jar:?]
    at io.airbyte.workers.general.DefaultCheckConnectionWorker.run(DefaultCheckConnectionWorker.java:71) ~[io.airbyte-airbyte-commons-worker-0.63.3.jar:?]
    at io.airbyte.workers.general.DefaultCheckConnectionWorker.run(DefaultCheckConnectionWorker.java:44) ~[io.airbyte-airbyte-commons-worker-0.63.3.jar:?]
    at io.airbyte.workers.temporal.TemporalAttemptExecution.get(TemporalAttemptExecution.java:138) ~[io.airbyte-airbyte-workers-0.63.3.jar:?]
    at io.airbyte.workers.temporal.check.connection.CheckConnectionActivityImpl.lambda$runWithJobOutput$1(CheckConnectionActivityImpl.java:227) ~[io.airbyte-airbyte-workers-0.63.3.jar:?]
    at io.airbyte.commons.temporal.HeartbeatUtils.withBackgroundHeartbeat(HeartbeatUtils.java:57) ~[io.airbyte-airbyte-commons-temporal-core-0.63.3.jar:?]
    at io.airbyte.workers.temporal.check.connection.CheckConnectionActivityImpl.runWithJobOutput(CheckConnectionActivityImpl.java:212) ~[io.airbyte-airbyte-workers-0.63.3.jar:?]
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[?:?]
    at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[?:?]
    at io.temporal.internal.activity.RootActivityInboundCallsInterceptor$POJOActivityInboundCallsInterceptor.executeActivity(RootActivityInboundCallsInterceptor.java:64) ~[temporal-sdk-1.22.3.jar:?]
    at io.temporal.internal.activity.RootActivityInboundCallsInterceptor.execute(RootActivityInboundCallsInterceptor.java:43) ~[temporal-sdk-1.22.3.jar:?]
    at io.temporal.internal.activity.ActivityTaskExecutors$BaseActivityTaskExecutor.execute(ActivityTaskExecutors.java:107) ~[temporal-sdk-1.22.3.jar:?]
    at io.temporal.internal.activity.ActivityTaskHandlerImpl.handle(ActivityTaskHandlerImpl.java:124) ~[temporal-sdk-1.22.3.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handleActivity(ActivityWorker.java:278) ~[temporal-sdk-1.22.3.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:243) ~[temporal-sdk-1.22.3.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:216) ~[temporal-sdk-1.22.3.jar:?]
    at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:105) ~[temporal-sdk-1.22.3.jar:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
2024-06-30 16:16:45 platform > 
2024-06-30 16:16:45 platform > ----- END CHECK -----
2024-06-30 16:16:45 platform > 
2024-06-30 16:16:46 platform > Retry State: RetryManager(completeFailureBackoffPolicy=BackoffPolicy(minInterval=PT10S, maxInterval=PT30M, base=3), partialFailureBackoffPolicy=null, successiveCompleteFailureLimit=5, totalCompleteFailureLimit=10, successivePartialFailureLimit=1000, totalPartialFailureLimit=20, successiveCompleteFailures=5, totalCompleteFailures=5, successivePartialFailures=0, totalPartialFailures=0)
 Backoff before next attempt: 13 minutes 30 seconds
2024-06-30 16:16:46 platform > Failing job: 22932, reason: Job failed after too many retries for connection b5abd10c-484c-487e-862e-3c283609ec1e
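As an aside, the two backoff values in the log (4 min 30 s after the 4th complete failure, 13 min 30 s after the 5th) are consistent with a simple exponential policy built from the logged RetryManager parameters (minInterval=PT10S, base=3, maxInterval=PT30M). The formula below is inferred from those two data points, not taken from Airbyte source:

```python
def backoff_seconds(failures: int, min_interval: int = 10, base: int = 3,
                    max_interval: int = 30 * 60) -> int:
    """Exponential backoff inferred from the logged retry state:
    min_interval * base**(failures - 1), capped at max_interval."""
    return min(min_interval * base ** (failures - 1), max_interval)

# Matches the log: 4 failures -> 270 s (4 min 30 s),
#                  5 failures -> 810 s (13 min 30 s)
```

This explains why the failures surface several minutes apart before the job is finally failed at the totalCompleteFailureLimit.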
M-Dahab commented 1 month ago

Running docker compose down and then docker compose up -d seems to resolve this for now.