airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

[worker] service high CPU usage #35663

Open aleksa-wholesome opened 4 months ago

aleksa-wholesome commented 4 months ago

Platform Version

0.50.51

What step the error happened?

During the Sync

Relevant information

We are running the OSS version on an e2-highmem-8 machine in GCP. Since we updated to v0.50.51, the airbyte-worker container has been hogging CPU, and its usage scales with the number of syncs running. If I run two syncs at the same time, usage goes up to 400%. We also see our CPU locked at 100% in the morning, when we run most of our syncs, while most syncs appear to sit in a pending state.

This seems like a bug. I have tried setting the JOB_MAIN_CONTAINER_CPU_LIMIT variable, but that does not seem to lower the CPU usage of the airbyte-worker at all. I have already seen this thread: https://airbytehq.slack.com/archives/C021JANJ6TY/p1706901124594629, so I am not the only one seeing this behavior. I was also able to replicate this on my local machine (Ryzen 7 5800x): with a fresh install running just 3 connections, CPU usage is 600%.

It also does not seem to be tied to any particular source. It happens whenever I run any source that has to run for more than a couple of seconds (which is most of them).

Screenshot from 2024-02-26 23-49-17
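
For anyone who wants to capture comparable numbers, here is a minimal sketch of how per-container CPU usage could be sampled. It assumes a Docker-based deployment with the `docker` CLI on the PATH; the helper names `parse_docker_stats` and `sample_cpu` are made up for illustration and are not part of Airbyte. Note that Docker reports CPU as a percentage of one core, so values above 100% are expected on multi-core machines.

```python
import subprocess


def parse_docker_stats(output: str) -> dict[str, float]:
    """Parse lines of the form '<container name><TAB><cpu>%' into {name: cpu_percent}."""
    usage = {}
    for line in output.splitlines():
        if "\t" not in line:
            continue  # skip blank or unexpected lines
        name, _, cpu = line.rpartition("\t")
        usage[name.strip()] = float(cpu.strip().rstrip("%"))
    return usage


def sample_cpu() -> dict[str, float]:
    """Take a one-shot CPU snapshot of all running containers via `docker stats`."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.Name}}\t{{.CPUPerc}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_docker_stats(result.stdout)
```

Calling `sample_cpu()` while one, two, and three syncs are running, and comparing the value for the airbyte-worker container, would show whether usage grows with the number of concurrent syncs as described above.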

Relevant log output

No response

zerobearing2 commented 3 weeks ago

I'm having the same issue with an Amazon seller connection: a single running sync eats 200% CPU. Did you find any resolution?