airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Too many open files exception #685

Closed sherifnada closed 3 years ago

sherifnada commented 3 years ago

Expected Behavior

If I create a sync from Postgres to BigQuery that runs on a 15-minute cadence, I expect it to keep running without issue.

Current Behavior

A user set up a standard sync from Postgres to BigQuery on AWS, following the Airbyte AWS guide. The first 9 of 10 syncs completed fine. The 10th failed with the following exception:

2020-10-19 18:36:33 INFO (/tmp/workspace/15/4) WorkerRun(call):58 - Executing worker wrapper...
2020-10-19 18:36:33 ERROR (/tmp/workspace/15/4) SingerSyncWorker(run):86 - Sync worker failed.
java.lang.RuntimeException: java.nio.file.FileSystemException: /tmp/workspace/15/4/target_config.json: Too many open files
at io.airbyte.commons.io.IOs.writeFile(IOs.java:50) ~[airbyte-commons-0.1.0-alpha.jar:?]
at io.airbyte.workers.protocols.singer.DefaultSingerTarget.start(DefaultSingerTarget.java:69) ~[airbyte-workers-0.1.0-alpha.jar:?]
at io.airbyte.workers.protocols.singer.SingerSyncWorker.run(SingerSyncWorker.java:71) [airbyte-workers-0.1.0-alpha.jar:?]
...

When attached to the airbyte-server container, lsof isn't available, and find /tmp/workspace returns 173 files. On the EC2 host, lsof returns 999 files, or almost 89k when run with sudo.
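Since lsof isn't available inside the container, one way to see how many descriptors the JVM itself holds is to list /proc/self/fd from within the process. The following is a minimal sketch, assuming a Linux host with /proc mounted. The class and method names (OpenFdCount, countEntries, countOpenFds) are hypothetical and not part of Airbyte.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical diagnostic helper; not part of the Airbyte codebase.
public class OpenFdCount {

    // Counts the entries in a directory. The directory stream is closed by
    // try-with-resources so that the check itself does not leak a descriptor.
    public static long countEntries(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.count();
        }
    }

    // Returns the number of descriptors currently open in this JVM process,
    // or -1 when /proc/self/fd is not available (for example on non-Linux hosts).
    // The result includes the descriptor used to read the directory itself.
    public static long countOpenFds() throws IOException {
        Path fdDir = Paths.get("/proc/self/fd");
        if (!Files.isDirectory(fdDir)) {
            return -1;
        }
        return countEntries(fdDir);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("open file descriptors: " + countOpenFds());
    }
}
```

Logging this value before each sync would show whether the count climbs from run to run, which is what a descriptor leak would look like. On HotSpot-based JVMs running on Unix, com.sun.management.UnixOperatingSystemMXBean also exposes getOpenFileDescriptorCount() and getMaxFileDescriptorCount(), which avoids depending on /proc.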

Steps to Reproduce

In progress

Severity of the bug for you

High

Context

Airbyte version: v0.1.0-alpha

cgardens commented 3 years ago

If we can't recreate this anymore, should we go ahead and close?

cgardens commented 3 years ago

Going to close this, as we haven't been able to recreate it since we rewrote our destinations to not use Singer.