airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Source Oracle - Memory Leak #27846

Open philippeboyd opened 1 year ago

philippeboyd commented 1 year ago

Connector Name

source-oracle

Connector Version

0.4.0

What step the error happened?

During the sync

Relevant information

There seems to be a big memory leak with the source-oracle connector. Vertical scaling doesn't help.

I sync about 1.5 GB of data from different Oracle sources, and the source-oracle-read container sometimes takes anywhere from 9 GB up to 18 GB of RAM.

There's no way that syncing 1.5 GB of data should take between 10 and 18 GB of RAM in a Java process.

Screenshot 2023-06-27 at 10 38 10

Might be linked to the other memory leak issue I opened #27844

marcosmarxm commented 1 year ago

@prateekmukhedkar is this something the connectors team is aware of?

philippeboyd commented 1 year ago

@marcosmarxm @prateekmukhedkar any news on this? We're seeing RAM usage explode when syncing 1 GB worth of data.

image

prateekmukhedkar commented 1 year ago

The amount of memory used will depend on the schema of the tables you have selected for replication. We have also realized that reading all the data from a table in one single read transaction is not the most efficient use of resources, and instead we will be reading data in smaller chunks. We have not scoped out these improvements for source-oracle at present.

At this point I'll suggest creating a view, limiting data from last X days and/or including certain columns from your source table. In the replication tab, configure a sync for the view you created. Please keep us updated on how things proceed.

philippeboyd commented 1 year ago

@prateekmukhedkar your answer doesn't make any sense. How can syncing a database with 1-2 GB worth of data take 30-40 GB of memory in the source-oracle-read container? It doesn't matter whether it's a single read transaction or not.

> At this point I'll suggest creating a view, limiting data from last X days and/or including certain columns from your source table. In the replication tab, configure a sync for the view you created. Please keep us updated on how things proceed.

Can't do that, I do not have write access to the source.

Here are the logs for another Oracle source sync I ran, which totals 613.83 MB | 536,041 emitted records. The source-oracle-read container ate up close to 9 GB of RAM.

2023-07-20 15:18:16 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (a09a7) -- Batch contains: 7982 records, 10.19 MB bytes.
2023-07-20 15:18:24 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (e19a1) -- Batch contains: 17717 records, 25.52 MB bytes.
2023-07-20 15:18:29 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (3f542) -- Batch contains: 12066 records, 11.13 MB bytes.
2023-07-20 15:18:33 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (87e8b) -- Batch contains: 20784 records, 19.81 MB bytes.
2023-07-20 15:18:39 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (c23cc) -- Batch contains: 33621 records, 30.78 MB bytes.
2023-07-20 15:18:52 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (e4982) -- Batch contains: 16505 records, 10.33 MB bytes.
2023-07-20 15:18:57 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (d980c) -- Batch contains: 39254 records, 24.53 MB bytes.
2023-07-20 15:19:02 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (4de90) -- Batch contains: 4169 records, 12.95 MB bytes.
2023-07-20 15:19:06 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (4dea8) -- Batch contains: 5539 records, 17.61 MB bytes.
2023-07-20 15:19:12 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (4034a) -- Batch contains: 7101 records, 22.84 MB bytes.
2023-07-20 15:19:20 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (87f4d) -- Batch contains: 10051 records, 37.62 MB bytes.
2023-07-20 15:19:27 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (c0482) -- Batch contains: 2625 records, 10.83 MB bytes.
2023-07-20 15:19:31 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (e81ef) -- Batch contains: 9687 records, 13.39 MB bytes.
2023-07-20 15:19:35 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (71d60) -- Batch contains: 16034 records, 26.22 MB bytes.
2023-07-20 15:19:42 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (b6223) -- Batch contains: 12334 records, 20.72 MB bytes.
2023-07-20 15:19:44 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (46b40) -- Batch contains: 32852 records, 13.54 MB bytes.
2023-07-20 15:19:49 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (71262) -- Batch contains: 5661 records, 12.26 MB bytes.
2023-07-20 15:19:54 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (02d3e) -- Batch contains: 6222 records, 10.35 MB bytes.
2023-07-20 15:19:57 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (00c20) -- Batch contains: 7518 records, 12.12 MB bytes.
2023-07-20 15:20:02 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (d9470) -- Batch contains: 14716 records, 23.62 MB bytes.
2023-07-20 15:20:09 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (0c8e8) -- Batch contains: 20486 records, 29.38 MB bytes.
2023-07-20 15:20:18 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (75925) -- Batch contains: 8363 records, 10.16 MB bytes.
2023-07-20 15:20:24 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (eb16f) -- Batch contains: 11304 records, 13.69 MB bytes.
2023-07-20 15:20:29 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (80c53) -- Batch contains: 19545 records, 24.79 MB bytes.
2023-07-20 15:20:35 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (8c812) -- Batch contains: 8556 records, 11.72 MB bytes.
2023-07-20 15:20:36 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (80edf) -- Batch contains: 11774 records, 15.65 MB bytes.
2023-07-20 15:20:39 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (27d63) -- Batch contains: 14873 records, 20.91 MB bytes.
2023-07-20 15:20:45 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (44052) -- Batch contains: 23602 records, 34.6 MB bytes.
2023-07-20 15:20:53 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (0a3d2) -- Batch contains: 24016 records, 35.26 MB bytes.
2023-07-20 15:21:00 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (549ab) -- Batch contains: 10157 records, 14.92 MB bytes.
2023-07-20 15:21:05 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (7f495) -- Batch contains: 18098 records, 26.57 MB bytes.
2023-07-20 15:21:11 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (b97f6) -- Batch contains: 25914 records, 37.7 MB bytes.
2023-07-20 15:21:16 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (5ca6c) -- Batch contains: 2081 records, 8.43 MB bytes.
2023-07-20 15:21:16 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (8b09f) -- Batch contains: 2430 records, 6.91 MB bytes.
2023-07-20 15:21:16 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (c7525) -- Batch contains: 6473 records, 9.65 MB bytes.
2023-07-20 15:21:16 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (22a78) -- Batch contains: 7910 records, 7.13 MB bytes.
2023-07-20 15:21:19 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (f8e9a) -- Batch contains: 10438 records, 15.05 MB bytes.
2023-07-20 15:21:20 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (2f1a1) -- Batch contains: 2682 records, 1.57 MB bytes.
2023-07-20 15:21:20 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (58c1d) -- Batch contains: 5512 records, 2.43 MB bytes.
2023-07-20 15:21:20 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (c7b99) -- Batch contains: 9771 records, 4.08 MB bytes.
2023-07-20 15:21:20 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (76dd2) -- Batch contains: 4731 records, 5.56 MB bytes.
2023-07-20 15:21:23 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (2f07b) -- Batch contains: 1288 records, 1.29 MB bytes.
2023-07-20 15:21:23 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (87df6) -- Batch contains: 1042 records, 1.17 MB bytes.
2023-07-20 15:21:23 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (11361) -- Batch contains: 439 records, 266.74 KB bytes.
2023-07-20 15:21:23 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (21c9a) -- Batch contains: 475 records, 744.23 KB bytes.
2023-07-20 15:21:25 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (dc833) -- Batch contains: 427 records, 140.89 KB bytes.
2023-07-20 15:21:25 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (baeca) -- Batch contains: 185 records, 231.05 KB bytes.
2023-07-20 15:21:25 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (85198) -- Batch contains: 326 records, 132.75 KB bytes.
2023-07-20 15:21:26 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (e8e74) -- Batch contains: 274 records, 106.14 KB bytes.
2023-07-20 15:21:27 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (48494) -- Batch contains: 79 records, 26.04 KB bytes.
2023-07-20 15:21:27 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (b7128) -- Batch contains: 34 records, 52.53 KB bytes.
2023-07-20 15:21:27 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (03e3b) -- Batch contains: 123 records, 37.73 KB bytes.
2023-07-20 15:21:27 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (6a7da) -- Batch contains: 69 records, 21.15 KB bytes.
2023-07-20 15:21:28 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (7cc73) -- Batch contains: 36 records, 14.27 KB bytes.
2023-07-20 15:21:29 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (4e2b1) -- Batch contains: 24 records, 10.04 KB bytes.
2023-07-20 15:21:29 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (f5720) -- Batch contains: 22 records, 8.09 KB bytes.
2023-07-20 15:21:29 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (a54f1) -- Batch contains: 1 records, 5.53 KB bytes.
2023-07-20 15:21:29 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (3a6fd) -- Batch contains: 11 records, 3.27 KB bytes.
2023-07-20 15:21:30 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (08a4e) -- Batch contains: 7 records, 3.18 KB bytes.
2023-07-20 15:21:31 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (594cf) -- Batch contains: 6 records, 1.53 KB bytes.
2023-07-20 15:21:31 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (85602) -- Batch contains: 7 records, 1.65 KB bytes.
2023-07-20 15:21:31 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (a6fc8) -- Batch contains: 3 records, 1.11 KB bytes.
2023-07-20 15:21:31 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (3555e) -- Batch contains: 1 records, 1.09 KB bytes.
2023-07-20 15:21:32 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (8ba16) -- Batch contains: 1 records, 1005 bytes bytes.
2023-07-20 15:21:33 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (6d6ca) -- Batch contains: 2 records, 742 bytes bytes.
2023-07-20 15:21:33 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (e10b3) -- Batch contains: 2 records, 656 bytes bytes.
2023-07-20 15:21:43 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (2f011) -- Batch contains: 1 records, 546 bytes bytes.
2023-07-20 15:21:43 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (cb50e) -- Batch contains: 1 records, 343 bytes bytes.
2023-07-20 15:21:44 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):155 Flush Worker (77dd2) -- Batch contains: 1 records, 287 bytes bytes.
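
One way to cross-check the reported total against a flush log like the one above is to sum the batch sizes. The following is a minimal sketch, not part of Airbyte: the class `FlushLogTotals` and its methods are hypothetical, the regular expression assumes the exact "Batch contains: N records, X UNIT bytes." wording shown in these lines, and the units are assumed to be binary (1 KB = 1024 bytes). An excerpt may not cover the whole sync, so the sum is only a lower bound.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: totals the records and bytes reported by FlushWorkers log lines. */
final class FlushLogTotals {

  // Matches e.g. "Batch contains: 7982 records, 10.19 MB bytes." and "... 1005 bytes bytes."
  private static final Pattern BATCH =
      Pattern.compile("Batch contains: (\\d+) records, ([\\d.]+) (bytes|KB|MB|GB) bytes");

  // ASSUMPTION: the log uses binary units (1 KB = 1024 bytes).
  static long unitBytes(String unit) {
    switch (unit) {
      case "KB": return 1024L;
      case "MB": return 1024L * 1024L;
      case "GB": return 1024L * 1024L * 1024L;
      default:   return 1L; // plain "bytes"
    }
  }

  /** Returns {totalRecords, approximateTotalBytes}; lines that do not match are skipped. */
  static long[] totals(List<String> lines) {
    long records = 0;
    double bytes = 0;
    for (String line : lines) {
      Matcher m = BATCH.matcher(line);
      if (m.find()) {
        records += Long.parseLong(m.group(1));
        bytes += Double.parseDouble(m.group(2)) * unitBytes(m.group(3));
      }
    }
    return new long[] {records, Math.round(bytes)};
  }

  public static void main(String[] args) {
    // Three lines copied from the log above; a real check would read the full log file.
    List<String> sample = List.of(
        "Flush Worker (a09a7) -- Batch contains: 7982 records, 10.19 MB bytes.",
        "Flush Worker (11361) -- Batch contains: 439 records, 266.74 KB bytes.",
        "Flush Worker (8ba16) -- Batch contains: 1 records, 1005 bytes bytes.");
    long[] t = totals(sample);
    System.out.printf("records=%d, bytes=%d%n", t[0], t[1]);
  }
}
```

Because the sizes in the log are already rounded for display, the summed byte count is approximate.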

image

CONTAINER ID   NAME                               CPU %     MEM USAGE / LIMIT     MEM %     NET I/O   BLOCK I/O   PIDS
2448b1ad10c2   source-oracle-read-15172-0-cvzyz   69.13%    11.47GiB / 31.36GiB   36.58%    0B / 0B   0B / 0B     31

philippeboyd commented 1 year ago

We're not facing this issue with other types of JDBC connectors such as MSSQL or DB2.

I'm trying to figure out what could be causing this in the source code of the Oracle source connector, but nothing stands out: the code seems very simple.

Could it be related to the old dependency com.oracle.database.jdbc:ojdbc8-production:19.7.0.0?

prateekmukhedkar commented 1 year ago

@philippeboyd I agree with your observation about the memory usage compared to the volume of data synced. At this point I am unable to point to a root cause or whether it is caused by an older dependency. You can try to upgrade the dependency, recompile the connector and see if it improves the memory usage. If you are willing to do that then I will provide instructions on how to develop and test connectors locally.

philippeboyd commented 1 year ago

Hi @prateekmukhedkar, I did try updating the old OJDBC dependency to the latest version, 21.10.0.0, but I'm facing the same issue. I even tried using OJDBC11; it doesn't change a thing.

philippeboyd commented 1 year ago

@prateekmukhedkar After going down the rabbit hole, I would like to have a discussion about this particular line in the class TwoStageSizeEstimator.

By doing getTargetBufferByteSize(Runtime.getRuntime().maxMemory()), we're essentially telling the connector that all available RAM is open bar for whatever it needs to do...

When I change the code to use getTargetBufferByteSize(null), it forces the estimator's getTargetBufferByteSize() to return MIN_BUFFER_BYTE_SIZE = 250L * 1024L * 1024L; // 250 MB, which helps a LOT with memory usage without affecting the sync's performance.

Instead of using 15-20 GB, it uses 2-3 GB.
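
To make the difference between the two calls concrete, here is a minimal sketch of the kind of logic described above. It is not the connector's actual source. Only the 250 MB floor and the "null falls back to the floor" behaviour come from this thread; the class name `BufferSizeSketch`, the method name `targetBufferByteSize`, and the 0.6 heap ratio are assumptions made for illustration.

```java
import java.util.Locale;

/** Illustrative sketch of a heap-scaled buffer target; NOT the Airbyte implementation. */
final class BufferSizeSketch {

  // From the comment above: the floor returned when no memory limit is passed (250 MB).
  static final long MIN_BUFFER_BYTE_SIZE = 250L * 1024L * 1024L;

  // ASSUMPTION: hypothetical fraction of the JVM max heap used as the buffer target.
  static final double ASSUMED_HEAP_RATIO = 0.6;

  /** Returns the floor when no limit is known, otherwise a heap-proportional target. */
  static long targetBufferByteSize(Long maxMemory) {
    if (maxMemory == null || maxMemory == Long.MAX_VALUE) {
      return MIN_BUFFER_BYTE_SIZE;
    }
    long scaled = Math.round(maxMemory * ASSUMED_HEAP_RATIO);
    return Math.max(MIN_BUFFER_BYTE_SIZE, scaled);
  }

  public static void main(String[] args) {
    long mib = 1024L * 1024L;
    long heap = Runtime.getRuntime().maxMemory(); // max heap this JVM will attempt to use
    System.out.printf(Locale.ROOT, "JVM max heap:          %d MiB%n", heap / mib);
    System.out.printf(Locale.ROOT, "target with max heap:  %d MiB%n", targetBufferByteSize(heap) / mib);
    System.out.printf(Locale.ROOT, "target with null:      %d MiB%n", targetBufferByteSize(null) / mib);
  }
}
```

Under these assumptions, the target grows with whatever heap the JVM is allowed, so giving the container more RAM raises the buffer target along with it, whereas passing null pins it to the 250 MB floor. That would be consistent with vertical scaling not helping.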

git blame suggests I loop in @tuliren.

philippeboyd commented 10 months ago

Any news on this? I'm not even syncing 3 GB of data... This memory leak on Source Oracle Read is not normal.

image

I can't even scale vertically: the connector takes all the RAM and maxes everything out.

image