StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

Airbyte connector. Source (parquet) -> Target (StarRocks allin1 container) #24772

Closed: alberttwong closed this issue 1 year ago

alberttwong commented 1 year ago

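I am using the StarRocks allin1 Docker container as the destination for an Airbyte sync from a parquet file source. The sync fails; the Airbyte job log follows.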

2023-06-06 18:29:49 INFO i.a.w.t.TemporalAttemptExecution(get):136 - Docker volume job log path: /tmp/workspace/2/0/logs.log
2023-06-06 18:29:49 INFO i.a.w.t.TemporalAttemptExecution(get):141 - Executing worker wrapper. Airbyte version: 0.44.5
2023-06-06 18:29:49 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to save workflow id for cancellation
2023-06-06 18:29:49 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:49 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:49 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- START CHECK -----
2023-06-06 18:29:49 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:49 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:49 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:49 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:49 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-06-06 18:29:49 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable FEATURE_FLAG_CLIENT: ''
2023-06-06 18:29:49 INFO i.a.c.i.LineGobbler(voidCall):149 - Checking if airbyte/source-file:0.3.4 exists...
2023-06-06 18:29:49 INFO i.a.c.i.LineGobbler(voidCall):149 - airbyte/source-file:0.3.4 was found locally.
2023-06-06 18:29:49 INFO i.a.w.p.DockerProcessFactory(create):139 - Creating docker container = source-file-check-2-0-uwynj with resources io.airbyte.config.ResourceRequirements@ea18cd0[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts io.airbyte.config.AllowedHosts@4ad66274[hosts=[*, *.datadoghq.com, *.datadoghq.eu, *.sentry.io],additionalProperties={}]
2023-06-06 18:29:49 INFO i.a.w.p.DockerProcessFactory(create):192 - Preparing command: docker run --rm --init -i -w /data/2/0 --log-driver none --name source-file-check-2-0-uwynj --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=airbyte/source-file:0.3.4 -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e USE_STREAM_CAPABLE_STATE=true -e FIELD_SELECTION_WORKSPACES= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT= -e AIRBYTE_VERSION=0.44.5 -e WORKER_JOB_ID=2 airbyte/source-file:0.3.4 check --config source_config.json
2023-06-06 18:29:49 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):181 - Reading messages from protocol version 0.2.0
2023-06-06 18:29:50 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - TransportParams: None
2023-06-06 18:29:52 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - Check succeeded
2023-06-06 18:29:52 INFO i.a.w.g.DefaultCheckConnectionWorker(run):115 - Check connection job received output: io.airbyte.config.StandardCheckConnectionOutput@1f0a7376[status=succeeded,message=<null>,additionalProperties={}]
2023-06-06 18:29:52 INFO i.a.w.t.TemporalAttemptExecution(get):163 - Stopping cancellation check scheduling...
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- END CHECK -----
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:52 INFO i.a.w.t.TemporalAttemptExecution(get):136 - Docker volume job log path: /tmp/workspace/2/0/logs.log
2023-06-06 18:29:52 INFO i.a.w.t.TemporalAttemptExecution(get):141 - Executing worker wrapper. Airbyte version: 0.44.5
2023-06-06 18:29:52 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to save workflow id for cancellation
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:52 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- START CHECK -----
2023-06-06 18:29:52 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:52 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:52 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:52 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-06-06 18:29:52 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable FEATURE_FLAG_CLIENT: ''
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - Checking if atwong/destination-starrocks:latest exists...
2023-06-06 18:29:52 INFO i.a.c.i.LineGobbler(voidCall):149 - atwong/destination-starrocks:latest was found locally.
2023-06-06 18:29:52 INFO i.a.w.p.DockerProcessFactory(create):139 - Creating docker container = destination-starrocks-check-2-0-hxzqx with resources io.airbyte.config.ResourceRequirements@678cd77d[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null
2023-06-06 18:29:52 INFO i.a.w.p.DockerProcessFactory(create):192 - Preparing command: docker run --rm --init -i -w /data/2/0 --log-driver none --name destination-starrocks-check-2-0-hxzqx --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=atwong/destination-starrocks:latest -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e USE_STREAM_CAPABLE_STATE=true -e FIELD_SELECTION_WORKSPACES= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT= -e AIRBYTE_VERSION=0.44.5 -e WORKER_JOB_ID=2 atwong/destination-starrocks:latest check --config source_config.json
2023-06-06 18:29:53 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):181 - Reading messages from protocol version 0.2.0
2023-06-06 18:29:53 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationCliParser(parseOptions):126 integration args: {check=null, config=source_config.json}
2023-06-06 18:29:53 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):108 Running integration: io.airbyte.integrations.destination.starrocks.StarRocksDestination
2023-06-06 18:29:53 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):109 Command: CHECK
2023-06-06 18:29:53 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):110 Integration config: IntegrationConfig{command=CHECK, configPath='source_config.json', catalogPath='null', statePath='null'}
2023-06-06 18:29:53 WARN i.a.w.i.VersionedAirbyteStreamFactory(internalLog):314 - WARN c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:29:53 WARN i.a.w.i.VersionedAirbyteStreamFactory(internalLog):314 - WARN c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:29:56 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):186 Completed integration: io.airbyte.integrations.destination.starrocks.StarRocksDestination
2023-06-06 18:29:56 INFO i.a.w.g.DefaultCheckConnectionWorker(run):115 - Check connection job received output: io.airbyte.config.StandardCheckConnectionOutput@27376230[status=succeeded,message=<null>,additionalProperties={}]
2023-06-06 18:29:56 INFO i.a.w.t.TemporalAttemptExecution(get):163 - Stopping cancellation check scheduling...
2023-06-06 18:29:56 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:56 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- END CHECK -----
2023-06-06 18:29:56 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:56 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to get state
2023-06-06 18:29:56 INFO i.a.w.h.DockerImageNameHelper(extractImageVersion):62 - Could not create semantic version from version latest, message: Invalid version string: latest
2023-06-06 18:29:56 INFO i.a.w.h.NormalizationInDestinationHelper(shouldNormalizeInDestination):52 - Requires Normalization: false Normalization Supported: false, Feature Flag Enabled: false
2023-06-06 18:29:56 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to set attempt sync config
2023-06-06 18:29:56 INFO i.a.c.t.s.DefaultTaskQueueMapper(getTaskQueue):31 - Called DefaultTaskQueueMapper getTaskQueue for geography auto
2023-06-06 18:29:59 INFO i.a.w.t.TemporalAttemptExecution(get):136 - Docker volume job log path: /tmp/workspace/2/0/logs.log
2023-06-06 18:29:59 INFO i.a.w.t.TemporalAttemptExecution(get):141 - Executing worker wrapper. Airbyte version: 0.44.5
2023-06-06 18:29:59 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to save workflow id for cancellation
2023-06-06 18:29:59 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to get the source for heartbeat
2023-06-06 18:29:59 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to get the source definition
2023-06-06 18:29:59 INFO i.a.w.g.ReplicationWorkerFactory(create):96 - Setting up source...
2023-06-06 18:29:59 INFO i.a.w.g.ReplicationWorkerFactory(create):100 - Setting up destination...
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable METRIC_CLIENT: ''
2023-06-06 18:29:59 WARN i.a.m.l.MetricClientFactory(initialize):60 - Metric client is already initialized to 
2023-06-06 18:29:59 INFO i.a.w.g.ReplicationWorkerFactory(create):109 - Setting up replication worker...
2023-06-06 18:29:59 WARN c.n.s.JsonMetaSchema(newValidator):278 - Unknown keyword example - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:29:59 WARN c.n.s.JsonMetaSchema(newValidator):278 - Unknown keyword existingJavaType - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:29:59 INFO i.a.w.g.DefaultReplicationWorker(run):147 - start sync worker. job id: 2 attempt id: 0
2023-06-06 18:29:59 INFO i.a.w.g.DefaultReplicationWorker(run):149 - Committing states from replication activity
2023-06-06 18:29:59 INFO i.a.w.g.DefaultReplicationWorker(run):152 - Committing stats from replication activity
2023-06-06 18:29:59 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:59 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- START REPLICATION -----
2023-06-06 18:29:59 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:29:59 INFO i.a.w.g.DefaultReplicationWorker(run):168 - configured sync modes: {null.nyc=full_refresh - overwrite}
2023-06-06 18:29:59 INFO i.a.w.i.DefaultAirbyteDestination(start):92 - Running destination...
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable FEATURE_FLAG_CLIENT: ''
2023-06-06 18:29:59 INFO i.a.c.i.LineGobbler(voidCall):149 - Checking if atwong/destination-starrocks:latest exists...
2023-06-06 18:29:59 INFO i.a.c.i.LineGobbler(voidCall):149 - atwong/destination-starrocks:latest was found locally.
2023-06-06 18:29:59 INFO i.a.w.p.DockerProcessFactory(create):139 - Creating docker container = destination-starrocks-write-2-0-vxsxk with resources io.airbyte.config.ResourceRequirements@52c4c092[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null
2023-06-06 18:29:59 INFO i.a.w.p.DockerProcessFactory(create):192 - Preparing command: docker run --rm --init -i -w /data/2/0 --log-driver none --name destination-starrocks-write-2-0-vxsxk --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=atwong/destination-starrocks:latest -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e USE_STREAM_CAPABLE_STATE=true -e FIELD_SELECTION_WORKSPACES= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT= -e AIRBYTE_VERSION=0.44.5 -e WORKER_JOB_ID=2 atwong/destination-starrocks:latest write --config destination_config.json --catalog destination_catalog.json
2023-06-06 18:29:59 INFO i.a.w.i.VersionedAirbyteMessageBufferedWriterFactory(createWriter):41 - Writing messages to protocol version 0.2.0
2023-06-06 18:29:59 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):181 - Reading messages from protocol version 0.2.0
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-06-06 18:29:59 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable FEATURE_FLAG_CLIENT: ''
2023-06-06 18:29:59 INFO i.a.c.i.LineGobbler(voidCall):149 - Checking if airbyte/source-file:0.3.4 exists...
2023-06-06 18:29:59 INFO i.a.c.i.LineGobbler(voidCall):149 - airbyte/source-file:0.3.4 was found locally.
2023-06-06 18:29:59 INFO i.a.w.p.DockerProcessFactory(create):139 - Creating docker container = source-file-read-2-0-kkmhn with resources io.airbyte.config.ResourceRequirements@428aa599[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts io.airbyte.config.AllowedHosts@68d47654[hosts=[*, *.datadoghq.com, *.datadoghq.eu, *.sentry.io],additionalProperties={}]
2023-06-06 18:29:59 INFO i.a.w.p.DockerProcessFactory(create):192 - Preparing command: docker run --rm --init -i -w /data/2/0 --log-driver none --name source-file-read-2-0-kkmhn --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=airbyte/source-file:0.3.4 -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e USE_STREAM_CAPABLE_STATE=true -e FIELD_SELECTION_WORKSPACES= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT= -e AIRBYTE_VERSION=0.44.5 -e WORKER_JOB_ID=2 airbyte/source-file:0.3.4 read --config source_config.json --catalog source_catalog.json
2023-06-06 18:29:59 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):181 - Reading messages from protocol version 0.2.0
2023-06-06 18:29:59 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromDstRunnable$4):289 - Destination output thread started.
2023-06-06 18:29:59 INFO i.a.w.i.HeartbeatTimeoutChaperone(runWithHeartbeatThread):94 - Starting source heartbeat check. Will check every 1 minutes.
2023-06-06 18:29:59 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$5):334 - Replication thread started.
2023-06-06 18:30:00 destination > INFO i.a.i.b.IntegrationCliParser(parseOptions):126 integration args: {catalog=destination_catalog.json, write=null, config=destination_config.json}
2023-06-06 18:30:00 destination > INFO i.a.i.b.IntegrationRunner(runInternal):108 Running integration: io.airbyte.integrations.destination.starrocks.StarRocksDestination
2023-06-06 18:30:00 destination > INFO i.a.i.b.IntegrationRunner(runInternal):109 Command: WRITE
2023-06-06 18:30:00 destination > INFO i.a.i.b.IntegrationRunner(runInternal):110 Integration config: IntegrationConfig{command=WRITE, configPath='destination_config.json', catalogPath='destination_catalog.json', statePath='null'}
2023-06-06 18:30:00 destination > WARN c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:30:00 destination > WARN c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:30:00 destination > INFO i.a.i.d.s.StarRocksDestination(getConsumer):65 JsonNode config: 
{
  "fe_host" : "serveo.net",
  "database" : "demo",
  "password":"**********",
  "username" : "root",
  "http_port" : 8030,
  "query_port" : 9030
}
2023-06-06 18:30:00 source > Reading nyc (https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2023-01.parquet)...
2023-06-06 18:30:00 source > TransportParams: None
2023-06-06 18:30:03 destination > INFO i.a.i.d.b.BufferedStreamConsumer(startTracked):144 class io.airbyte.integrations.destination.buffered_stream_consumer.BufferedStreamConsumer started.
2023-06-06 18:30:03 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onStartFunction$0):73 Preparing tmp tables in destination started for 1 streams
2023-06-06 18:30:03 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onStartFunction$0):79 Preparing tmp table in destination started for stream nyc. tmp table name: _airbyte_tmp_wam_nyc
2023-06-06 18:30:04 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onStartFunction$0):86 Preparing tmp tables in destination completed.
2023-06-06 18:30:32 INFO i.a.w.t.TemporalAttemptExecution(lambda$getCancellationChecker$6):231 - Running sync worker cancellation...
2023-06-06 18:30:32 INFO i.a.w.g.DefaultReplicationWorker(cancel):528 - Cancelling replication worker...
2023-06-06 18:30:32 INFO i.a.w.g.ReplicationWorkerHelper(endOfSource):62 - Total records read: 0 (0 bytes)
2023-06-06 18:30:32 INFO i.a.w.i.FieldSelector(reportMetrics):122 - Schema validation was performed to a max of 10 records with errors per stream.
2023-06-06 18:30:33 destination > INFO i.a.i.b.FailureTrackingAirbyteMessageConsumer(close):80 Airbyte message consumer: succeeded.
2023-06-06 18:30:33 destination > INFO i.a.i.d.b.BufferedStreamConsumer(close):255 executing on success close procedure.
2023-06-06 18:30:33 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onCloseFunction$3):116 Finalizing stream nyc. tmp table _airbyte_tmp_wam_nyc, final table nyc
2023-06-06 18:30:33 destination > INFO i.a.i.d.s.DefaultStreamLoader(close):88 Finished stream load, database : demo, tmp table : _airbyte_tmp_wam_nyc
2023-06-06 18:30:33 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onCloseFunction$3):133 Finalizing tables in destination completed.
2023-06-06 18:30:34 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onCloseFunction$3):137 Cleaning tmp tables in destination started for 1 streams
2023-06-06 18:30:34 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onCloseFunction$3):140 Clean tmp table in destination started for stream nyc.tmp table name: _airbyte_tmp_wam_nyc
2023-06-06 18:30:34 destination > INFO i.a.i.d.s.StarRocksBufferedConsumerFactory(lambda$onCloseFunction$3):145 Cleaning tmp tables in destination completed.
2023-06-06 18:30:34 destination > INFO i.a.i.b.IntegrationRunner(runInternal):186 Completed integration: io.airbyte.integrations.destination.starrocks.StarRocksDestination
2023-06-06 18:30:34 ERROR i.a.w.g.DefaultReplicationWorker(replicate):260 - Sync worker failed.
io.airbyte.workers.internal.exception.SourceException: Source process exited with non-zero exit code 137
    at io.airbyte.workers.general.DefaultReplicationWorker.lambda$readFromSrcAndWriteToDstRunnable$5(DefaultReplicationWorker.java:379) ~[io.airbyte-airbyte-commons-worker-0.44.5.jar:?]
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.lang.Thread.run(Thread.java:1589) ~[?:?]
    Suppressed: io.airbyte.workers.exception.WorkerException: Source process exit with code 137. This warning is normal if the job was cancelled.
        at io.airbyte.workers.internal.DefaultAirbyteSource.close(DefaultAirbyteSource.java:150) ~[io.airbyte-airbyte-commons-worker-0.44.5.jar:?]
        at io.airbyte.workers.general.DefaultReplicationWorker.replicate(DefaultReplicationWorker.java:200) ~[io.airbyte-airbyte-commons-worker-0.44.5.jar:?]
        at io.airbyte.workers.general.DefaultReplicationWorker.run(DefaultReplicationWorker.java:176) ~[io.airbyte-airbyte-commons-worker-0.44.5.jar:?]
        at io.airbyte.workers.general.DefaultReplicationWorker.run(DefaultReplicationWorker.java:79) ~[io.airbyte-airbyte-commons-worker-0.44.5.jar:?]
        at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$5(TemporalAttemptExecution.java:195) ~[io.airbyte-airbyte-workers-0.44.5.jar:?]
        at java.lang.Thread.run(Thread.java:1589) ~[?:?]
2023-06-06 18:30:34 INFO i.a.w.g.DefaultReplicationWorker(cancel):537 - Cancelling destination...
2023-06-06 18:30:34 INFO i.a.w.i.DefaultAirbyteDestination(cancel):152 - Attempting to cancel destination process...
2023-06-06 18:30:34 INFO i.a.w.i.DefaultAirbyteDestination(cancel):157 - Destination process exists, cancelling...
2023-06-06 18:30:34 INFO i.a.w.i.DefaultAirbyteDestination(cancel):159 - Cancelled destination process!
2023-06-06 18:30:34 INFO i.a.w.g.DefaultReplicationWorker(cancel):545 - Cancelling source...
2023-06-06 18:30:34 INFO i.a.w.i.DefaultAirbyteSource(cancel):157 - Attempting to cancel source process...
2023-06-06 18:30:34 INFO i.a.w.i.DefaultAirbyteSource(cancel):162 - Source process exists, cancelling...
2023-06-06 18:30:34 INFO i.a.w.i.DefaultAirbyteSource(cancel):164 - Cancelled source process!
2023-06-06 18:30:34 INFO i.a.w.t.TemporalAttemptExecution(lambda$getCancellationChecker$6):235 - Interrupting worker thread...
2023-06-06 18:30:34 INFO i.a.w.t.TemporalAttemptExecution(lambda$getCancellationChecker$6):238 - Cancelling completable future...
2023-06-06 18:30:34 INFO i.a.w.t.TemporalAttemptExecution(get):163 - Stopping cancellation check scheduling...
2023-06-06 18:30:34 WARN i.a.c.t.CancellationHandler$TemporalCancellationHandler(checkAndHandleCancellation):60 - Job either timed out or was cancelled.
2023-06-06 18:30:34 INFO i.a.c.t.TemporalUtils(withBackgroundHeartbeat):307 - Stopping temporal heartbeating...
2023-06-06 18:30:34 WARN i.t.i.a.ActivityTaskExecutors$BaseActivityTaskExecutor(execute):114 - Activity failure. ActivityId=03175117-048c-31b1-a22b-91fe00c81767, activityType=Replicate, attempt=1
java.lang.RuntimeException: java.util.concurrent.CancellationException
    at io.airbyte.commons.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:305) ~[io.airbyte-airbyte-commons-temporal-0.44.5.jar:?]
    at io.airbyte.workers.temporal.sync.ReplicationActivityImpl.replicate(ReplicationActivityImpl.java:122) ~[io.airbyte-airbyte-workers-0.44.5.jar:?]
    at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) ~[?:?]
    at java.lang.reflect.Method.invoke(Method.java:578) ~[?:?]
    at io.temporal.internal.activity.RootActivityInboundCallsInterceptor$POJOActivityInboundCallsInterceptor.executeActivity(RootActivityInboundCallsInterceptor.java:64) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.activity.RootActivityInboundCallsInterceptor.execute(RootActivityInboundCallsInterceptor.java:43) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.activity.ActivityTaskExecutors$BaseActivityTaskExecutor.execute(ActivityTaskExecutors.java:95) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.activity.ActivityTaskHandlerImpl.handle(ActivityTaskHandlerImpl.java:92) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handleActivity(ActivityWorker.java:241) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:206) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:179) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:93) ~[temporal-sdk-1.17.0.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.lang.Thread.run(Thread.java:1589) ~[?:?]
Caused by: java.util.concurrent.CancellationException
    at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2510) ~[?:?]
    at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getCancellationChecker$6(TemporalAttemptExecution.java:241) ~[io.airbyte-airbyte-workers-0.44.5.jar:?]
    at io.airbyte.commons.temporal.CancellationHandler$TemporalCancellationHandler.checkAndHandleCancellation(CancellationHandler.java:59) ~[io.airbyte-airbyte-commons-temporal-0.44.5.jar:?]
    at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getCancellationChecker$7(TemporalAttemptExecution.java:244) ~[io.airbyte-airbyte-workers-0.44.5.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577) ~[?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358) ~[?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    ... 3 more
2023-06-06 18:30:34 WARN i.t.i.w.ActivityWorker$TaskHandlerImpl(logExceptionDuringResultReporting):365 - Failure during reporting of activity result to the server. ActivityId = 03175117-048c-31b1-a22b-91fe00c81767, ActivityType = Replicate, WorkflowId=sync_2, WorkflowType=SyncWorkflow, RunId=7ce77335-dc7c-41d0-8d9a-05d9c6937cd6
io.grpc.StatusRuntimeException: NOT_FOUND: workflow execution already completed
    at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.54.0.jar:1.54.0]
    at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.54.0.jar:1.54.0]
    at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.54.0.jar:1.54.0]
    at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.respondActivityTaskFailed(WorkflowServiceGrpc.java:3866) ~[temporal-serviceclient-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.lambda$sendReply$1(ActivityWorker.java:320) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.retryer.GrpcRetryer.lambda$retry$0(GrpcRetryer.java:52) ~[temporal-serviceclient-1.17.0.jar:?]
    at io.temporal.internal.retryer.GrpcSyncRetryer.retry(GrpcSyncRetryer.java:67) ~[temporal-serviceclient-1.17.0.jar:?]
    at io.temporal.internal.retryer.GrpcRetryer.retryWithResult(GrpcRetryer.java:60) ~[temporal-serviceclient-1.17.0.jar:?]
    at io.temporal.internal.retryer.GrpcRetryer.retry(GrpcRetryer.java:50) ~[temporal-serviceclient-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.sendReply(ActivityWorker.java:315) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handleActivity(ActivityWorker.java:252) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:206) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:179) ~[temporal-sdk-1.17.0.jar:?]
    at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:93) ~[temporal-sdk-1.17.0.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.lang.Thread.run(Thread.java:1589) ~[?:?]
2023-06-06 18:30:34 INFO i.a.w.g.DefaultReplicationWorker(getReplicationOutput):449 - sync summary: {
  "status" : "failed",
  "recordsSynced" : 0,
  "bytesSynced" : 0,
  "startTime" : 1686076199838,
  "endTime" : 1686076234506,
  "totalStats" : {
    "bytesCommitted" : 0,
    "bytesEmitted" : 0,
    "destinationStateMessagesEmitted" : 0,
    "destinationWriteEndTime" : 1686076234486,
    "destinationWriteStartTime" : 1686076199910,
    "meanSecondsBeforeSourceStateMessageEmitted" : 0,
    "maxSecondsBeforeSourceStateMessageEmitted" : 0,
    "maxSecondsBetweenStateMessageEmittedandCommitted" : 0,
    "meanSecondsBetweenStateMessageEmittedandCommitted" : 0,
    "recordsEmitted" : 0,
    "recordsCommitted" : 0,
    "replicationEndTime" : 1686076234488,
    "replicationStartTime" : 1686076199838,
    "sourceReadEndTime" : 1686076232844,
    "sourceReadStartTime" : 1686076199875,
    "sourceStateMessagesEmitted" : 0
  },
  "streamStats" : [ ]
}
2023-06-06 18:30:34 INFO i.a.w.g.DefaultReplicationWorker(getReplicationOutput):450 - failures: [ {
  "failureOrigin" : "source",
  "internalMessage" : "Source process exited with non-zero exit code 137",
  "externalMessage" : "Something went wrong within the source connector",
  "metadata" : {
    "attemptNumber" : 0,
    "jobId" : 2,
    "connector_command" : "read"
  },
  "stacktrace" : "io.airbyte.workers.internal.exception.SourceException: Source process exited with non-zero exit code 137\n\tat io.airbyte.workers.general.DefaultReplicationWorker.lambda$readFromSrcAndWriteToDstRunnable$5(DefaultReplicationWorker.java:379)\n\tat java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1589)\n",
  "timestamp" : 1686076232935
} ]
2023-06-06 18:30:34 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:34 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- END REPLICATION -----
2023-06-06 18:30:34 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to Get a connection by connection Id
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to get the most recent source actor catalog
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to Get a connection by connection Id
2023-06-06 18:30:32 INFO i.a.w.t.TemporalAttemptExecution(get):136 - Docker volume job log path: /tmp/workspace/2/1/logs.log
2023-06-06 18:30:32 INFO i.a.w.t.TemporalAttemptExecution(get):141 - Executing worker wrapper. Airbyte version: 0.44.5
2023-06-06 18:30:32 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to save workflow id for cancellation
2023-06-06 18:30:32 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:32 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- START CHECK -----
2023-06-06 18:30:32 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:32 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:32 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:32 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:30:32 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:30:32 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-06-06 18:30:32 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable FEATURE_FLAG_CLIENT: ''
2023-06-06 18:30:32 INFO i.a.c.i.LineGobbler(voidCall):149 - Checking if airbyte/source-file:0.3.4 exists...
2023-06-06 18:30:32 INFO i.a.c.i.LineGobbler(voidCall):149 - airbyte/source-file:0.3.4 was found locally.
2023-06-06 18:30:32 INFO i.a.w.p.DockerProcessFactory(create):139 - Creating docker container = source-file-check-2-1-fqsnk with resources io.airbyte.config.ResourceRequirements@592f98f2[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts io.airbyte.config.AllowedHosts@516504ad[hosts=[*, *.datadoghq.com, *.datadoghq.eu, *.sentry.io],additionalProperties={}]
2023-06-06 18:30:32 INFO i.a.w.p.DockerProcessFactory(create):192 - Preparing command: docker run --rm --init -i -w /data/2/1 --log-driver none --name source-file-check-2-1-fqsnk --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=airbyte/source-file:0.3.4 -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e USE_STREAM_CAPABLE_STATE=true -e FIELD_SELECTION_WORKSPACES= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=1 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT= -e AIRBYTE_VERSION=0.44.5 -e WORKER_JOB_ID=2 airbyte/source-file:0.3.4 check --config source_config.json
2023-06-06 18:30:33 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):181 - Reading messages from protocol version 0.2.0
2023-06-06 18:30:35 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - TransportParams: None
2023-06-06 18:30:36 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - Check succeeded
2023-06-06 18:30:37 INFO i.a.w.g.DefaultCheckConnectionWorker(run):115 - Check connection job received output: io.airbyte.config.StandardCheckConnectionOutput@63d0d871[status=succeeded,message=<null>,additionalProperties={}]
2023-06-06 18:30:37 INFO i.a.w.t.TemporalAttemptExecution(get):163 - Stopping cancellation check scheduling...
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- END CHECK -----
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:37 INFO i.a.w.t.TemporalAttemptExecution(get):136 - Docker volume job log path: /tmp/workspace/2/1/logs.log
2023-06-06 18:30:37 INFO i.a.w.t.TemporalAttemptExecution(get):141 - Executing worker wrapper. Airbyte version: 0.44.5
2023-06-06 18:30:37 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to save workflow id for cancellation
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:37 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- START CHECK -----
2023-06-06 18:30:37 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:37 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:30:37 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:30:37 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-06-06 18:30:37 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable FEATURE_FLAG_CLIENT: ''
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - Checking if atwong/destination-starrocks:latest exists...
2023-06-06 18:30:37 INFO i.a.c.i.LineGobbler(voidCall):149 - atwong/destination-starrocks:latest was found locally.
2023-06-06 18:30:37 INFO i.a.w.p.DockerProcessFactory(create):139 - Creating docker container = destination-starrocks-check-2-1-zownh with resources io.airbyte.config.ResourceRequirements@6c2bbb84[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null
2023-06-06 18:30:37 INFO i.a.w.p.DockerProcessFactory(create):192 - Preparing command: docker run --rm --init -i -w /data/2/1 --log-driver none --name destination-starrocks-check-2-1-zownh --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=atwong/destination-starrocks:latest -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e USE_STREAM_CAPABLE_STATE=true -e FIELD_SELECTION_WORKSPACES= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=1 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT= -e AIRBYTE_VERSION=0.44.5 -e WORKER_JOB_ID=2 atwong/destination-starrocks:latest check --config source_config.json
2023-06-06 18:30:37 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):181 - Reading messages from protocol version 0.2.0
2023-06-06 18:30:37 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationCliParser(parseOptions):126 integration args: {check=null, config=source_config.json}
2023-06-06 18:30:37 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):108 Running integration: io.airbyte.integrations.destination.starrocks.StarRocksDestination
2023-06-06 18:30:37 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):109 Command: CHECK
2023-06-06 18:30:37 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):110 Integration config: IntegrationConfig{command=CHECK, configPath='source_config.json', catalogPath='null', statePath='null'}
2023-06-06 18:30:37 WARN i.a.w.i.VersionedAirbyteStreamFactory(internalLog):314 - WARN c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:30:38 WARN i.a.w.i.VersionedAirbyteStreamFactory(internalLog):314 - WARN c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2023-06-06 18:30:40 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):317 - INFO i.a.i.b.IntegrationRunner(runInternal):186 Completed integration: io.airbyte.integrations.destination.starrocks.StarRocksDestination
2023-06-06 18:30:40 INFO i.a.w.g.DefaultCheckConnectionWorker(run):115 - Check connection job received output: io.airbyte.config.StandardCheckConnectionOutput@3b1896c7[status=succeeded,message=<null>,additionalProperties={}]
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:40 INFO i.a.w.t.TemporalAttemptExecution(get):163 - Stopping cancellation check scheduling...
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- END CHECK -----
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to get state
2023-06-06 18:30:40 INFO i.a.w.h.DockerImageNameHelper(extractImageVersion):62 - Could not create semantic version from version latest, message: Invalid version string: latest
2023-06-06 18:30:40 INFO i.a.w.h.NormalizationInDestinationHelper(shouldNormalizeInDestination):52 - Requires Normalization: false Normalization Supported: false, Feature Flag Enabled: false
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to set attempt sync config
2023-06-06 18:30:40 INFO i.a.c.t.s.DefaultTaskQueueMapper(getTaskQueue):31 - Called DefaultTaskQueueMapper getTaskQueue for geography auto
2023-06-06 18:30:40 INFO i.a.w.t.TemporalAttemptExecution(get):136 - Docker volume job log path: /tmp/workspace/2/1/logs.log
2023-06-06 18:30:40 INFO i.a.w.t.TemporalAttemptExecution(get):141 - Executing worker wrapper. Airbyte version: 0.44.5
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to save workflow id for cancellation
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to get the source for heartbeat
2023-06-06 18:30:40 INFO i.a.a.c.AirbyteApiClient(retryWithJitterThrows):222 - Attempt 0 to get the source definition
2023-06-06 18:30:40 INFO i.a.w.g.ReplicationWorkerFactory(create):96 - Setting up source...
2023-06-06 18:30:40 INFO i.a.w.g.ReplicationWorkerFactory(create):100 - Setting up destination...
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable METRIC_CLIENT: ''
2023-06-06 18:30:40 WARN i.a.m.l.MetricClientFactory(initialize):60 - Metric client is already initialized to 
2023-06-06 18:30:40 INFO i.a.w.g.ReplicationWorkerFactory(create):109 - Setting up replication worker...
2023-06-06 18:30:40 INFO i.a.w.g.DefaultReplicationWorker(run):147 - start sync worker. job id: 2 attempt id: 1
2023-06-06 18:30:40 INFO i.a.w.g.DefaultReplicationWorker(run):149 - Committing states from replication activity
2023-06-06 18:30:40 INFO i.a.w.g.DefaultReplicationWorker(run):152 - Committing stats from replication activity
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- START REPLICATION -----
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-06 18:30:40 INFO i.a.w.g.DefaultReplicationWorker(run):168 - configured sync modes: {null.nyc=full_refresh - overwrite}
2023-06-06 18:30:40 INFO i.a.w.i.DefaultAirbyteDestination(start):92 - Running destination...
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable LAUNCHDARKLY_KEY: ''
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable FEATURE_FLAG_CLIENT: ''
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - Checking if atwong/destination-starrocks:latest exists...
2023-06-06 18:30:40 INFO i.a.c.i.LineGobbler(voidCall):149 - atwong/destination-starrocks:latest was found locally.
2023-06-06 18:30:40 INFO i.a.w.p.DockerProcessFactory(create):139 - Creating docker container = destination-starrocks-write-2-1-ejxmw with resources io.airbyte.config.ResourceRequirements@e3ed678[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null
2023-06-06 18:30:40 INFO i.a.w.p.DockerProcessFactory(create):192 - Preparing command: docker run --rm --init -i -w /data/2/1 --log-driver none --name destination-starrocks-write-2-1-ejxmw --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=atwong/destination-starrocks:latest -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e USE_STREAM_CAPABLE_STATE=true -e FIELD_SELECTION_WORKSPACES= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=1 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT= -e AIRBYTE_VERSION=0.44.5 -e WORKER_JOB_ID=2 atwong/destination-starrocks:latest write --config destination_config.json --catalog destination_catalog.json
2023-06-06 18:30:40 INFO i.a.w.i.VersionedAirbyteMessageBufferedWriterFactory(createWriter):41 - Writing messages to protocol version 0.2.0
2023-06-06 18:30:40 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):181 - Reading messages from protocol version 0.2.0
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2023-06-06 18:30:40 INFO i.a.c.EnvConfigs(getEnvOrDefault):1222 - Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
liuzhongjun89 commented 1 year ago

Total records read: 0 (0 bytes)
2023-06-06 18:30:32 INFO i.a.w.i.FieldSelector(reportMetrics):122 - Schema validation was performed to a max of 10 records with errors per stream.
...
2023-06-06 18:30:34 INFO i.a.w.g.DefaultReplicationWorker(getReplicationOutput):450 - failures: [ {
  "failureOrigin" : "source",
  "internalMessage" : "Source process exited with non-zero exit code 137",
  "externalMessage" : "Something went wrong within the source connector",
  "metadata" : {
    "attemptNumber" : 0,
    "jobId" : 2,
    "connector_command" : "read"
  },
  "stacktrace" : "io.airbyte.workers.internal.exception.SourceException: Source process exited with non-zero exit code 137\n\tat io.airbyte.workers.general.DefaultReplicationWorker.lambda$readFromSrcAndWriteToDstRunnable$5(DefaultReplicationWorker.java:379)\n\tat java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1589)\n",
  "timestamp" : 1686076232935
} ]

Could you check your source config or source URL? It looks like the source connector process was killed, most likely because of OOM.
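
For context on the OOM guess: exit code 137 follows the common shell and Docker convention of 128 plus a signal number, so it maps to signal 9 (SIGKILL), which is what the kernel OOM killer sends. SIGKILL can also come from other places (for example a manual `docker kill`), so this is a hint rather than proof. A minimal sketch of the decoding, where `describe_exit_code` is a hypothetical helper and not part of Airbyte:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a process exit code using the 128 + signal convention."""
    if code == 0:
        return "success"
    if code > 128:
        signum = code - 128
        try:
            # Map the signal number to its symbolic name, e.g. 9 -> SIGKILL.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with error code {code}"


print(describe_exit_code(137))
```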

alberttwong commented 1 year ago

The source URL seems to be okay. I can access https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2023-01.parquet just fine. I also documented my steps at https://github.com/StarRocks/starrocks/discussions/23713

  "metadata" : {
    "attemptNumber" : 1,
    "jobId" : 5,
    "connector_command" : "read"
  },
  "stacktrace" : "io.airbyte.workers.internal.exception.SourceException: Source process exited with non-zero exit code 137\n\tat io.airbyte.workers.general.DefaultReplicationWorker.lambda$readFromSrcAndWriteToDstRunnable$5(DefaultReplicationWorker.java:379)\n\tat java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)\n\tat java.base/java.lang.Thread.run(Thread.java:1589)\n",
  "timestamp" : 1686246768331
} ]
2023-06-08 17:52:49 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-08 17:52:49 INFO i.a.c.i.LineGobbler(voidCall):149 - ----- END REPLICATION -----
2023-06-08 17:52:49 INFO i.a.c.i.LineGobbler(voidCall):149 - 
2023-06-08 17:52:49 INFO i.a.w.t.TemporalAttemptExecution(get):163 - Stopping cancellation check scheduling...
2023-06-08 17:52:49 INFO i.a.w.t.s.ReplicationActivityImpl(lambda$replicate$3):159 - sync summary: io.airbyte.config.StandardSyncOutput@2e2b0110[standardSyncSummary=io.airbyte.config.StandardSyncSummary@796f1f60[status=failed,recordsSynced=0,bytesSynced=0,startTime=1686246749880,endTime=1686246769528,totalStats=io.airbyte.config.SyncStats@68efefa0[bytesCommitted=0,bytesEmitted=0,destinationStateMessagesEmitted=0,destinationWriteEndTime=1686246769527,destinationWriteStartTime=1686246749961,estimatedBytes=<null>,estimatedRecords=<null>,meanSecondsBeforeSourceStateMessageEmitted=0,maxSecondsBeforeSourceStateMessageEmitted=0,maxSecondsBetweenStateMessageEmittedandCommitted=0,meanSecondsBetweenStateMessageEmittedandCommitted=0,recordsEmitted=0,recordsCommitted=0,replicationEndTime=1686246769528,replicationStartTime=1686246749880,sourceReadEndTime=1686246768315,sourceReadStartTime=1686246749918,sourceStateMessagesEmitted=0,additionalProperties={}],streamStats=[],additionalProperties={}],normalizationSummary=<null>,webhookOperationSummary=<null>,state=<null>,outputCatalog=io.airbyte.protocol.models.ConfiguredAirbyteCatalog@4663685c[streams=[io.airbyte.protocol.models.ConfiguredAirbyteStream@30b6d201[stream=io.airbyte.protocol.models.AirbyteStream@31235f68[name=nyc,jsonSchema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object","properties":{"DOLocationID":{"type":["number","null"]},"RatecodeID":{"type":["number","null"]},"fare_amount":{"type":["number","null"]},"congestion_surcharge":{"type":["number","null"]},"tpep_dropoff_datetime":{"format":"date-time","type":["string","null"]},"VendorID":{"type":["number","null"]},"passenger_count":{"type":["number","null"]},"tolls_amount":{"type":["number","null"]},"improvement_surcharge":{"type":["number","null"]},"trip_distance":{"type":["number","null"]},"payment_type":{"type":["number","null"]},"store_and_fwd_flag":{"type":["string","null"]},"total_amount":{"type":["number","null"]},"extra":{"type":["number","null"
]},"tip_amount":{"type":["number","null"]},"mta_tax":{"type":["number","null"]},"airport_fee":{"type":["number","null"]},"PULocationID":{"type":["number","null"]},"tpep_pickup_datetime":{"format":"date-time","type":["string","null"]}}},supportedSyncModes=[full_refresh],sourceDefinedCursor=<null>,defaultCursorField=[],sourceDefinedPrimaryKey=[],namespace=<null>,additionalProperties={}],syncMode=full_refresh,cursorField=[],destinationSyncMode=overwrite,primaryKey=[],additionalProperties={}]],additionalProperties={}],failures=[io.airbyte.config.FailureReason@26262403[failureOrigin=source,failureType=<null>,internalMessage=Source process exited with non-zero exit code 137,externalMessage=Something went wrong within the source connector,metadata=io.airbyte.config.Metadata@7bda9913[additionalProperties={attemptNumber=1, jobId=5, connector_command=read}],stacktrace=io.airbyte.workers.internal.exception.SourceException: Source process exited with non-zero exit code 137
    at io.airbyte.workers.general.DefaultReplicationWorker.lambda$readFromSrcAndWriteToDstRunnable$5(DefaultReplicationWorker.java:379)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1589)
,retryable=<null>,timestamp=1686246768331,additionalProperties={}]],commitStateAsap=true,additionalProperties={}]
2023-06-08 17:52:49 INFO i.a.w.t.s.ReplicationActivityImpl(lambda$replicate$3):164 - Sync summary length: 3459
alberttwong commented 1 year ago

Ahh, the parquet file is too big! I tried an 11 MB parquet file and it worked: https://d37ci6vzurychx.cloudfront.net/trip-data/green_tripdata_2023-01.parquet
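
For anyone hitting the same thing, one way to check a file's size before pointing the source connector at it is an HTTP HEAD request. This is only a sketch: `remote_size_bytes` and `within_budget` are hypothetical helpers, the budget value is an assumption you would pick for your own Docker memory limit, and it assumes the server reports `Content-Length`. How much memory the source connector actually needs per file is not something this thread establishes.

```python
from typing import Optional
from urllib.request import Request, urlopen


def remote_size_bytes(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the Content-Length reported for url, or None if absent."""
    req = Request(url, method="HEAD")  # HEAD fetches headers only, no body
    with urlopen(req, timeout=timeout) as resp:
        length = resp.headers.get("Content-Length")
    return int(length) if length is not None else None


def within_budget(size_bytes: Optional[int], budget_bytes: int) -> bool:
    """True only when the size is known and does not exceed the budget."""
    return size_bytes is not None and size_bytes <= budget_bytes
```

Usage would look like `within_budget(remote_size_bytes(url), 50 * 1024 * 1024)`, with the 50 MB budget being an arbitrary example value.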