Expected Behavior
If I create a sync from Postgres to BigQuery that runs on a 15-minute cadence, I expect it to run continuously without issue.
Current Behavior
A user set up a standard sync from Postgres to BigQuery on AWS following the Airbyte AWS guide. The sync completed successfully on the first 9 of 10 runs; on the 10th run, it failed with the following exception:
2020-10-19 18:36:33 INFO (/tmp/workspace/15/4) WorkerRun(call):58 - Executing worker wrapper...
2020-10-19 18:36:33 ERROR (/tmp/workspace/15/4) SingerSyncWorker(run):86 - Sync worker failed.
java.lang.RuntimeException: java.nio.file.FileSystemException: /tmp/workspace/15/4/target_config.json: Too many open files
at io.airbyte.commons.io.IOs.writeFile(IOs.java:50) ~[airbyte-commons-0.1.0-alpha.jar:?]
at io.airbyte.workers.protocols.singer.DefaultSingerTarget.start(DefaultSingerTarget.java:69) ~[airbyte-workers-0.1.0-alpha.jar:?]
at io.airbyte.workers.protocols.singer.SingerSyncWorker.run(SingerSyncWorker.java:71) [airbyte-workers-0.1.0-alpha.jar:?]
...
When attached to the airbyte-server container, lsof isn't available, and find /tmp/workspace returns 173 files. On the EC2 host, lsof reports 999 open files, and almost 89k when run with sudo.
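Since lsof isn't available inside the container, open descriptors can still be counted directly from /proc. A minimal diagnostic sketch, assuming /proc is mounted and a POSIX shell is available in the container:

```shell
#!/bin/sh
# Count open file descriptors per process by listing /proc/<pid>/fd,
# then print the processes with the most descriptors. No lsof needed.
for pid in /proc/[0-9]*; do
  n=$(ls "$pid/fd" 2>/dev/null | wc -l)
  echo "$n $pid"
done | sort -rn | head
```

Comparing the top count against that process's limit (cat /proc/&lt;pid&gt;/limits, "Max open files" row) should show whether a worker or connector process is leaking descriptors toward its limit.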
Steps to Reproduce
In progress
Severity of the bug for you
High
Context
Airbyte version: v0.1.0-alpha