airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Airbyte Server Memory leak / Have a way to specify max Memory usage in GB and not percentage #27844

Open philippeboyd opened 1 year ago

philippeboyd commented 1 year ago

What method are you using to run Airbyte?

Docker

Platform Version or Helm Chart Version

0.50.5

What step the error happened?

During the Sync

Relevant information

Having the flag -XX:MaxRAMPercentage=75.0 on every component (even connectors) is not logical in my opinion, and it can exhaust the host's memory and crash everything.

For instance, if the server is capped at 75% of the RAM and a connector is also capped at 75% of the RAM, those Java processes run in different containers and do not talk to each other. As far as each one knows, it is the only process on the VM, so they can ALL grow to 75% of the available RAM. You essentially add up those 75% MaxRAMPercentage caps.

❯ grep -r MaxRAMPercentage .
./airbyte-metrics/reporter/build.gradle:    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
./airbyte-bootloader/build.gradle:    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
./airbyte-cron/build.gradle:    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
./airbyte-container-orchestrator/build.gradle:    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
./airbyte-workers/build.gradle:    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
./airbyte-server/build.gradle:    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
./airbyte-connector-atelier-server/build.gradle:    applicationDefaultJvmArgs = ['-XX:+ExitOnOutOfMemoryError', '-XX:MaxRAMPercentage=75.0']
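
To make the arithmetic concrete, here is a minimal Java sketch of the worst case described above. It assumes no per-container memory limit is set, so every JVM sees the full 31.36 GiB shown in the LIMIT column of the docker stats output below, and it assumes all seven components matched by the grep run at the same time, which may not hold in practice. The class and method names are made up for illustration and are not part of Airbyte. Note that MaxRAMPercentage only bounds the heap, so real process memory can be higher.

```java
public class HeapCeilings {

    // Heap ceiling a single JVM derives from -XX:MaxRAMPercentage, in GiB.
    // visibleRamGiB is the memory the JVM believes it has (host RAM when
    // the container has no memory limit).
    static double heapCeilingGiB(double visibleRamGiB, double maxRamPercentage) {
        return visibleRamGiB * maxRamPercentage / 100.0;
    }

    // Sum of the ceilings when jvmCount independent JVMs share one host.
    static double combinedCeilingGiB(double visibleRamGiB, double maxRamPercentage, int jvmCount) {
        return heapCeilingGiB(visibleRamGiB, maxRamPercentage) * jvmCount;
    }

    public static void main(String[] args) {
        double visibleRamGiB = 31.36; // assumption: LIMIT column from docker stats
        double percentage = 75.0;     // value of -XX:MaxRAMPercentage
        int jvmCount = 7;             // assumption: one JVM per matched build.gradle

        System.out.printf("Per-JVM heap ceiling: %.2f GiB%n",
                heapCeilingGiB(visibleRamGiB, percentage));
        System.out.printf("Combined ceiling for %d JVMs: %.2f GiB%n",
                jvmCount, combinedCeilingGiB(visibleRamGiB, percentage, jvmCount));

        // For comparison: the heap ceiling of the JVM running this program.
        double ownMaxGiB = Runtime.getRuntime().maxMemory() / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("This JVM's own max heap: %.2f GiB%n", ownMaxGiB);
    }
}
```

Each JVM on its own is allowed roughly 23.5 GiB of heap, so two of them growing toward their cap is already enough to exceed the 31.36 GiB host.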

There's no valid reason for the server to use 17 GB (out of 32) in my case.

CONTAINER ID   NAME                                        CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
da12bf1411a3   airbyte-proxy                               0.00%     7.84MiB / 31.36GiB    0.02%     33.1MB / 33.2MB   13.1MB / 8.19kB   3
0ed2297314d2   airbyte-webapp                              0.00%     10.95MiB / 31.36GiB   0.03%     25MB / 32.9MB     12.5MB / 8.19kB   5
a6c5fb7780ee   airbyte-cron                                0.10%     793.6MiB / 31.36GiB   2.47%     124MB / 863kB     203MB / 197kB     48
f639ae929917   airbyte-worker                              6.29%     1.333GiB / 31.36GiB   4.25%     20.6MB / 8.12MB   142MB / 213kB     197
0ac72827b0b9   airbyte-server                              0.14%     17GiB / 31.36GiB      54.19%    3.72GB / 51.7MB   188MB / 344MB     102
0fb6dea25c53   airbyte-connector-builder-server            0.10%     231.1MiB / 31.36GiB   0.72%     1.59kB / 168B     47.2MB / 98.3kB   37
86ac05b3ab0f   airbyte-db                                  0.18%     140.5MiB / 31.36GiB   0.44%     464MB / 4.07GB    84.6MB / 138MB    39
5831e193e437   airbyte-temporal                            1.36%     122.8MiB / 31.36GiB   0.38%     456MB / 567MB     134MB / 8.19kB    13

After a few more syncs, here's the airbyte-server container's RAM usage:

[image: airbyte-server container RAM usage]

Relevant log output

No response

marcosmarxm commented 1 year ago

@philippeboyd where are you running the server and what OS? Maybe this can give some hint about what is causing this.

philippeboyd commented 1 year ago

@marcosmarxm it should not really be relevant, since each Java process runs in its own JVM, but I'm running Airbyte with Docker Compose on a GCP COS VM.

gustavohwulee commented 1 year ago

Same problem: Docker Compose, EC2, Airbyte 0.50.15.

[image]

Auric-Manteo commented 1 year ago

Same problem here, Airbyte 0.50.27. Airbyte is taking up more than 20 GB of RAM, even when not running any sync. @gustavohwulee, @philippeboyd I started experimenting with the configs and have now got it down to under 6 GB. I still need to pinpoint which of the settings is actually necessary and would appreciate your collaboration, but for now these are my changes. I hope that helps your case!

In .env I set JOB_MAIN_CONTAINER_MEMORY_LIMIT to 2GB and NORMALIZATION_JOB_MAIN_CONTAINER_MEMORY_LIMIT to 1GB. In the Docker Compose file I added a memory limit of 2GB to the worker service and 5GB to the server service.

Make sure to test these settings by restarting Airbyte and running your syncs so you can rest assured this is enough for you!

@marcosmarxm It would be great if you could make a recommendation for better limits based on your experience! It feels like 1GB for a normalization job is overly pessimistic :D

Note that in the screenshots I edited the docker-compose.yml file directly instead of using environment variables, which would have been better :D

[images: edited docker-compose.yml]

Auric-Manteo commented 1 year ago

The mem_limit option is ignored with Compose file format V3, since it's only supported by V2, so the effect came from the changed environment variables alone! That seems like a great configuration then :)

If you'd still like to limit the memory of the Docker services, you need to use the new V3 format: https://docs.docker.com/compose/compose-file/compose-file-v3/#resources
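
For reference, a minimal sketch of what the V3 resources format from the link above could look like for the limits mentioned earlier in this thread (5 GB for the server, 2 GB for the worker). The service names `server` and `worker` are assumptions about the Airbyte docker-compose.yml and should be checked against your file. As far as I know, the classic docker-compose tool only honors `deploy` outside swarm mode when run with the `--compatibility` flag, so verify the result with docker stats.

```yaml
services:
  server:            # assumed service name, check your docker-compose.yml
    deploy:
      resources:
        limits:
          memory: 5g # limit mentioned in the thread for the server
  worker:            # assumed service name
    deploy:
      resources:
        limits:
          memory: 2g # limit mentioned in the thread for the worker
```

If the limit is applied, the LIMIT column in docker stats should show the configured value for that container instead of the host's total memory.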