airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

On restart of EC2 - both source and destination connectors go missing & cannot be pulled #39337

Open slunia opened 1 month ago

slunia commented 1 month ago

Connector Name

shopify

Connector Version

2.2.1

What step the error happened?

During the sync

Relevant information

Hi, our Airbyte server on EC2 works normally; however, whenever we restart the instance, both our source (Shopify) and destination (Firestore) connector images go missing, and during the sync they are not pulled automatically.

We have to log in separately over SSH and pull them again for it to work.
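
For reference, the manual workaround looks roughly like this (a sketch, run on the EC2 host after connecting over SSH; the image tags are taken from the job log below):

# on the EC2 host, after connecting over SSH:
docker pull airbyte/source-shopify:2.2.1
docker pull airbyte/destination-firestore:0.1.3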

Can you guide us on where the issue is?

Relevant log output

2024-06-07 07:02:00 platform > Docker volume job log path: /tmp/workspace/71/0/logs.log
2024-06-07 07:02:00 platform > Executing worker wrapper. Airbyte version: 0.61.0
2024-06-07 07:02:00 platform > Attempt 0 to save workflow id for cancellation
2024-06-07 07:02:00 platform > start sync worker. job id: 71 attempt id: 0
2024-06-07 07:02:00 platform > 
2024-06-07 07:02:00 platform > ----- START REPLICATION -----
2024-06-07 07:02:00 platform > 
2024-06-07 07:02:00 platform > Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2024-06-07 07:02:00 platform > Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2024-06-07 07:02:00 platform > Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2024-06-07 07:02:00 platform > Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2024-06-07 07:02:00 platform > Running destination...
2024-06-07 07:02:00 platform > Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2024-06-07 07:02:00 platform > Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2024-06-07 07:02:00 platform > Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2024-06-07 07:02:00 platform > Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2024-06-07 07:02:00 platform > Checking if airbyte/source-shopify:2.2.1 exists...
2024-06-07 07:02:00 platform > Checking if airbyte/destination-firestore:0.1.3 exists...
2024-06-07 07:02:00 platform > airbyte/destination-firestore:0.1.3 not found locally. Attempting to pull the image...
2024-06-07 07:02:00 platform > airbyte/source-shopify:2.2.1 not found locally. Attempting to pull the image...
2024-06-07 07:02:00 platform > Image does not exist.
2024-06-07 07:02:00 platform > Image does not exist.
2024-06-07 07:02:01 platform > thread status... timeout thread: false , replication thread: true
2024-06-07 07:02:01 platform > sync summary: {
  "status" : "failed",
  "startTime" : 1717743720197,
  "endTime" : 1717743721024,
  "totalStats" : {
    "bytesEmitted" : 0,
    "destinationStateMessagesEmitted" : 0,
    "destinationWriteEndTime" : 0,
    "destinationWriteStartTime" : 1717743720258,
    "meanSecondsBeforeSourceStateMessageEmitted" : 0,
    "maxSecondsBeforeSourceStateMessageEmitted" : 0,
    "meanSecondsBetweenStateMessageEmittedandCommitted" : 0,
    "recordsEmitted" : 0,
    "replicationEndTime" : 1717743721020,
    "replicationStartTime" : 1717743720197,
    "sourceReadEndTime" : 0,
    "sourceReadStartTime" : 1717743720267,
    "sourceStateMessagesEmitted" : 0
  },
  "streamStats" : [ ],
  "performanceMetrics" : {
    "processFromSource" : {
      "elapsedTimeInNanos" : 0,
      "executionCount" : 0,
      "avgExecTimeInNanos" : "NaN"
    },
    "readFromSource" : {
      "elapsedTimeInNanos" : 0,
      "executionCount" : 0,
      "avgExecTimeInNanos" : "NaN"
    },
    "processFromDest" : {
      "elapsedTimeInNanos" : 0,
      "executionCount" : 0,
      "avgExecTimeInNanos" : "NaN"
    },
    "writeToDest" : {
      "elapsedTimeInNanos" : 0,
      "executionCount" : 0,
      "avgExecTimeInNanos" : "NaN"
    },
    "readFromDest" : {
      "elapsedTimeInNanos" : 0,
      "executionCount" : 0,
      "avgExecTimeInNanos" : "NaN"
    }
  }
}
2024-06-07 07:02:01 platform > failures: [ {
  "failureOrigin" : "replication",
  "internalMessage" : "io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/source-shopify:2.2.1",
  "externalMessage" : "Something went wrong during replication",
  "metadata" : {
    "attemptNumber" : 0,
    "jobId" : 71
  },
  "stacktrace" : "java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/source-shopify:2.2.1\n\tat io.airbyte.workers.general.ReplicationWorkerHelper.startSource(ReplicationWorkerHelper.kt:234)\n\tat io.airbyte.workers.general.BufferedReplicationWorker.lambda$run$1(BufferedReplicationWorker.java:171)\n\tat io.airbyte.workers.general.BufferedReplicationWorker.lambda$runAsync$2(BufferedReplicationWorker.java:235)\n\tat java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1583)\nCaused by: io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/source-shopify:2.2.1\n\tat io.airbyte.workers.process.DockerProcessFactory.create(DockerProcessFactory.java:117)\n\tat io.airbyte.workers.process.AirbyteIntegrationLauncher.read(AirbyteIntegrationLauncher.java:227)\n\tat io.airbyte.workers.internal.DefaultAirbyteSource.start(DefaultAirbyteSource.java:93)\n\tat io.airbyte.workers.general.ReplicationWorkerHelper.startSource(ReplicationWorkerHelper.kt:232)\n\t... 6 more\n",
  "timestamp" : 1717743720993
}, {
  "failureOrigin" : "replication",
  "internalMessage" : "io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/destination-firestore:0.1.3",
  "externalMessage" : "Something went wrong during replication",
  "metadata" : {
    "attemptNumber" : 0,
    "jobId" : 71
  },
  "stacktrace" : "java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/destination-firestore:0.1.3\n\tat io.airbyte.workers.general.ReplicationWorkerHelper.startDestination(ReplicationWorkerHelper.kt:216)\n\tat io.airbyte.workers.general.BufferedReplicationWorker.lambda$run$0(BufferedReplicationWorker.java:170)\n\tat io.airbyte.workers.general.BufferedReplicationWorker.lambda$runAsync$2(BufferedReplicationWorker.java:235)\n\tat java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1583)\nCaused by: io.airbyte.workers.exception.WorkerException: Could not find image: airbyte/destination-firestore:0.1.3\n\tat io.airbyte.workers.process.DockerProcessFactory.create(DockerProcessFactory.java:117)\n\tat io.airbyte.workers.process.AirbyteIntegrationLauncher.write(AirbyteIntegrationLauncher.java:265)\n\tat io.airbyte.workers.internal.DefaultAirbyteDestination.start(DefaultAirbyteDestination.java:110)\n\tat io.airbyte.workers.general.ReplicationWorkerHelper.startDestination(ReplicationWorkerHelper.kt:214)\n\t... 6 more\n",
  "timestamp" : 1717743720993
} ]
2024-06-07 07:02:01 platform > 
2024-06-07 07:02:01 platform > ----- END REPLICATION -----
2024-06-07 07:02:01 platform > 
2024-06-07 07:02:02 platform > Retry State: RetryManager(completeFailureBackoffPolicy=BackoffPolicy(minInterval=PT10S, maxInterval=PT30M, base=3), partialFailureBackoffPolicy=null, successiveCompleteFailureLimit=5, totalCompleteFailureLimit=10, successivePartialFailureLimit=1000, totalPartialFailureLimit=10, successiveCompleteFailures=1, totalCompleteFailures=1, successivePartialFailures=0, totalPartialFailures=0)
 Backoff before next attempt: 10 seconds

marcosmarxm commented 1 month ago

What version of the platform are you using? Are you using an external or a local database to store your connections?

hwildwood commented 1 month ago

Same thing happens to me.

Edit: Restarting docker fixed the issue for me.
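
For anyone else trying this: on a systemd-based Linux host (an assumption, since the commenter does not say which OS) the restart would typically be:

# restart the Docker daemon (systemd-based hosts)
sudo systemctl restart docker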

wwfch-cyrill commented 1 month ago

https://github.com/airbytehq/airbyte-platform/pull/334

marcosmarxm commented 1 month ago

@perangel @bgroff can you take a look at this issue? It looks like it is impacting a couple of users, and the fix is a small change in the docker compose file.

slunia commented 2 weeks ago

Hi, has this issue been fixed yet? @marcosmarxm

wwfch-cyrill commented 2 weeks ago

@slunia It's not fixed; as you can see here on master, there are no changes on that side: https://github.com/airbytehq/airbyte-platform/pull/334/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9

Kleinkind commented 2 weeks ago

We faced the same problem on GCP. The workaround you proposed in this comment solved it for us:

Create a docker-compose.override.yaml containing:

services:
  docker-proxy:
    container_name: docker-proxy
    restart: unless-stopped
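
In case it helps others: restart: unless-stopped tells Docker to bring the docker-proxy container back up after a daemon restart or host reboot unless it was stopped manually, and Compose picks up docker-compose.override.yaml automatically, so re-applying the stack (assuming Docker Compose v2) should be enough:

# from the directory containing docker-compose.yaml and the override
docker compose up -d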

Thank you