awslabs / mountpoint-s3

A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.

mountpoint-s3 has high memory usage, resulting in an "Out of memory: Kill process" error in the analysis software diamond #566

Open sunl opened 11 months ago

sunl commented 11 months ago

Mountpoint for Amazon S3 version

mount-s3 v1.0.2

AWS Region

cn-northwest-1

Describe the running environment

Running on an EC2 instance (c6i.4xlarge) with CentOS 7 against an S3 bucket in the same account.

What happened?

My testing scenario uses diamond blast for genomics analysis; the command used is as follows:

docker run -v /data:/data \
-v /database:/data/db/RD/Metagenome/database \
--privileged --log-opt max-size=10m --log-opt max-file=3 \
dockerimage diamond blastp --threads 16 -b 3 -q /data/gene_catalogue.faa.${index} \
-d /data/db/RD/Metagenome/database/nr.dmnd -o /data/Unigenes_vs_nr_blt.txt.${index} \
--max-target-seqs 10 --evalue 0.0001 --outfmt 6 qseqid qlen sseqid stitle slen \
pident length mismatch gapopen qstart qend sstart send evalue bitscore

Among them, nr.dmnd is the reference database that the analysis relies on, stored in the S3 bucket. The S3 bucket is mounted at the /database directory on the EC2 instance with mount-s3.
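The mount itself is a standard mount-s3 invocation, roughly like this (the bucket name is a placeholder and the exact options used are not shown):

# mount the bucket containing nr.dmnd at /database
mount-s3 --region cn-northwest-1 <bucket-name> /database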

Each EC2 instance runs one task to process one sample, and each task takes approximately 6 hours. I simultaneously started 20 EC2 instances running 20 tasks (index from 1 to 20). Of the 20 tasks, 2-3 often fail after running for about 5 hours, and the indices of the failing tasks are not the same each time. The following error message is observed:

Oct 11 15:28:26 ip-10-0-0-13 kernel: Out of memory: Kill process 3484 (diamond) score 750 or sacrifice child
Oct 11 15:28:27 ip-10-0-0-13 kernel: Out of memory: Kill process 3722 (diamond) score 750 or sacrifice child
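(For reference, kernel OOM-killer messages like these can usually be retrieved after the fact; on CentOS 7 they typically land in /var/log/messages as well as the kernel ring buffer:)

grep -i 'out of memory' /var/log/messages
dmesg -T | grep -i 'out of memory'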

Then I replaced mount-s3 with goofys, and all 20 tasks succeeded every time, but the analysis time was about 8-10% longer than with mount-s3.
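(The goofys swap is just mounting the same bucket at the same path, roughly like this; the bucket name is a placeholder and goofys options are omitted:)

goofys <bucket-name> /database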

I suspect that this issue is caused by the high memory usage of mount-s3. Do you have any suggestions, or could the memory usage of mount-s3 be optimized in the future?

Relevant log output

No response

sauraank commented 11 months ago

Hi! Can you please attach log output after adding the logging options --debug and --log-metrics to mount-s3? This will help identify the issue, if there is one, that is causing the high memory usage.
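For example, something along these lines (bucket name, mount point, and log directory are placeholders):

# remount with debug logging and metrics written to a log directory
mount-s3 --debug --log-metrics --log-directory /var/log/mountpoint-s3 <bucket-name> /database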

sunl commented 11 months ago

@sauraank Hi Ankit, here is the last part of the log output:

2023-10-27T10:38:31.992847Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167830880256..167839268864) duration=619.674882ms ttfb=Some(19.627978ms) request_id=C9H8RCRM21FJX8T8
2023-10-27T10:38:31.997260Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167839268864..167847657472) duration=624.054982ms ttfb=Some(16.989013ms) request_id=C9HBFDB0JRKCQJ7J
2023-10-27T10:38:32.001599Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167847657472..167856046080) duration=628.365369ms ttfb=Some(18.690246ms) request_id=C9H0AH38QZ7GNV6B
2023-10-27T10:38:32.005927Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167856046080..167864434688) duration=632.660478ms ttfb=Some(17.289167ms) request_id=C9H26SERB8YD9D1F
2023-10-27T10:38:32.011292Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167864434688..167872823296) duration=637.991723ms ttfb=Some(17.854621ms) request_id=C9H5K90Q1VGR86T5
2023-10-27T10:38:32.015615Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167872823296..167881211904) duration=642.287694ms ttfb=Some(16.93879ms) request_id=C9H8S69R023Z10DK
2023-10-27T10:38:32.019988Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167881211904..167889600512) duration=646.631275ms ttfb=Some(18.783757ms) request_id=C9H3E9R0VBB3ZM3G
2023-10-27T10:38:32.024377Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167889600512..167897989120) duration=650.99241ms ttfb=Some(15.544991ms) request_id=C9HFRQX40ZKPJ0T0
2023-10-27T10:38:32.028770Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: request finished request_type=GetObject http_status=206 range=Some(167897989120..167906377728) duration=655.355688ms ttfb=Some(18.587842ms) request_id=C9H9F4CHMNRFGE3D
2023-10-27T10:38:32.028845Z DEBUG read{req=2593454 ino=3 fh=2 offset=164685119488 size=131072}:prefetch{range=165758894080..167906377728}:get_object{id=509 bucket="sun-magigene-bash" key="db/NR/nr_V20211219.dmnd" range=Some(165758894080..167906377728) if_match=Some(ETag { etag: "\"43f7ffe4cc841d6a5028111316709772-3679\"" })}: mountpoint_s3_client::s3_crt_client: meta request finished duration=1.985284458s
2023-10-27T10:38:36.583334Z  INFO mountpoint_s3::metrics: fs.current_handles[type=read]: 1
2023-10-27T10:38:36.583371Z  INFO mountpoint_s3::metrics: fuse.op_latency_us[op=read]: n=1: min=1982464 p10=1990655 p50=1990655 avg=1986560.00 p90=1990655 p99=1990655 p99.9=1990655 max=1990655
2023-10-27T10:38:36.583373Z  INFO mountpoint_s3::metrics: s3.meta_requests.throughput_mibs[op=get_object,size=>16MiB]: n=1: min=1024 p10=1031 p50=1031 avg=1028.00 p90=1031 p99=1031 p99.9=1031 max=1031
2023-10-27T10:38:36.583375Z  INFO mountpoint_s3::metrics: s3.meta_requests.total_latency_us[op=get_object]: n=1: min=1982464 p10=1990655 p50=1990655 avg=1986560.00 p90=1990655 p99=1990655 p99.9=1990655 max=1990655
2023-10-27T10:38:36.583377Z  INFO mountpoint_s3::metrics: s3.meta_requests[op=get_object]: 1
2023-10-27T10:38:36.583379Z  INFO mountpoint_s3::metrics: s3.requests.first_byte_latency_us[op=get_object,type=GetObject]: n=101: min=14080 p10=15615 p50=18559 avg=18177.90 p90=20095 p99=23807 p99.9=23935 max=23935
2023-10-27T10:38:36.583381Z  INFO mountpoint_s3::metrics: s3.requests.total_latency_us[op=get_object,type=GetObject]: n=101: min=528384 p10=569343 p50=647167 avg=651000.40 p90=749567 p99=786431 p99.9=790527 max=790527
2023-10-27T10:38:36.583382Z  INFO mountpoint_s3::metrics: s3.requests[op=get_object,type=GetObject]: 101 (n=101)
2023-10-27T10:38:41.583494Z  INFO mountpoint_s3::metrics: fs.current_handles[type=read]: 1
2023-10-27T10:38:46.583647Z  INFO mountpoint_s3::metrics: fs.current_handles[type=read]: 1
2023-10-27T10:38:51.583797Z  INFO mountpoint_s3::metrics: fs.current_handles[type=read]: 1
2023-10-27T10:38:56.583944Z  INFO mountpoint_s3::metrics: fs.current_handles[type=read]: 1
2023-10-27T10:39:00.001990Z DEBUG fuser::request: FUSE(2597518) ino 0x0000000000000003 FLUSH fh FileHandle(2), lock owner LockOwner(11849907347463555948)    
2023-10-27T10:39:00.001987Z DEBUG fuser::request: FUSE(2597519) ino 0x0000000000000000 INTERRUPT unique RequestId(2597518)    
2023-10-27T10:39:00.519905Z DEBUG fuser::request: FUSE(2597520) ino 0x0000000000000003 RELEASE fh FileHandle(2), flags 0x8000, flush false, lock owner None    
2023-10-27T10:39:01.105542Z DEBUG fuser::request: FUSE(2597522) ino 0x0000000000000001 STATFS    
2023-10-27T10:39:01.584108Z  INFO mountpoint_s3::metrics: fuse.op_latency_us[op=flush]: n=1: min=160768 p10=161791 p50=161791 avg=161280.00 p90=161791 p99=161791 p99.9=161791 max=161791
2023-10-27T10:39:01.584129Z  INFO mountpoint_s3::metrics: fuse.op_latency_us[op=release]: n=1: min=116224 p10=116735 p50=116735 avg=116480.00 p90=116735 p99=116735 p99.9=116735 max=116735

and you can get the whole log file from the attachment: mountpoint-s3-2023-10-27T06-44-50Z.zip

dannycjones commented 11 months ago

Hey @sunl, I have three quick questions:

dannycjones commented 11 months ago

I took a quick look over the logs. Here are some rough notes for others looking into this issue.

dannycjones commented 11 months ago

My bad, I misunderstood part of the issue. It's the diamond application which is encountering the "Out of memory" error due to memory usage on the machine.

This explains some of the behaviors such as FUSE receiving an INTERRUPT message, since the application was interrupted due to OOM.


We recently made a fix to some prefetching behavior which would cause excessive data to be fetched from S3: #488. The fix was released in mount-s3 v1.1.0. @sunl, can you see if using the latest version solves the issue?
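(A quick way to confirm which version is actually installed before re-testing; the exact output wording may vary between releases:)

mount-s3 --version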

Separately, I am planning to dig into what other metrics could help us understand what's going on on the mountpoint-s3 side, so that issues like this can be diagnosed much faster.

sunl commented 11 months ago

@dannycjones

By the way, I saw the new version released last week, so I upgraded to v1.1.0 and ran some tests with the --debug, --log-metrics, and --log-directory options. I still got the same issue.

monthonk commented 11 months ago

Hi, @sunl

Agreed that the memory on those instances should be enough to run your applications, and we'd like to dig deeper into this problem.

The next step for us is to reproduce the issue, but it's a bit hard for us to use diamond as we're not familiar with the software. It would be great if you could provide a small script (probably Python) that constantly reads data from the mounted directory until we see the OOM problem. We will let you know as soon as we have any updates on this issue.
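Something roughly along these lines would do, here as a shell sketch rather than Python (the paths, file name, and iteration count are placeholders, and it only loosely approximates diamond's read pattern):

#!/bin/bash
# Hypothetical reproduction: repeatedly read 1 GiB slices of a large object
# through the Mountpoint mount and record mount-s3's resident memory.
MOUNT_DIR=/database                   # where mount-s3 has the bucket mounted
BIG_FILE="$MOUNT_DIR/nr.dmnd"         # any multi-GiB object will do
for i in $(seq 1 200); do
    # read 1 GiB starting at a pseudo-random MiB offset ($RANDOM is 0-32767)
    dd if="$BIG_FILE" of=/dev/null bs=1M count=1024 skip=$RANDOM 2>/dev/null
    echo "pass $i: mount-s3 RSS (KiB) = $(ps -o rss= -C mount-s3)"
done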

daltschu22 commented 9 months ago

I'm experiencing a similar issue.

Running a find command against my mounted bucket causes the memory usage to steadily tick upwards, and once the find is killed, that memory usage is not released. We are not configuring any of the cache options.
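Roughly speaking, the workload and measurement look like this (paths are placeholders, not the exact command I ran):

# walk the entire mounted bucket in one terminal...
find /mnt/my-bucket -type f > /dev/null
# ...while watching mount-s3's memory in another
watch -n 5 'ps -o rss,vsz -C mount-s3'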

dannycjones commented 9 months ago

> I'm experiencing a similar issue.
>
> Running a find command against my mounted bucket causes the memory usage to steadily tick upwards, and once the find is killed, that memory usage is not released. We are not configuring any of the cache options.

Hey @daltschu22, do you mind opening a new bug report? This sounds like a case that will be easier to reproduce. Details of the EC2 instance or machine being used, the number of S3 objects, and how they are organized will be useful. The exact find command (redactions OK) would also be great.

anelson commented 9 months ago

Hi: I talked to Dev briefly at re:Invent about this issue. I think my company is hitting this bug too. We can provide access to the bucket and a copy of our scanner binary that seems to cause the issue.

Per Dev's request I opened an AWS support ticket 14460192231 with more details.

anelson commented 8 months ago

Update: we can no longer reproduce this leak behavior using the latest release.

As we run the scan, we see memory usage creep upwards, but at the end of the scan it's on the order of 100MB used. That usage doesn't go away until we unmount, but it doesn't go higher than that. In earlier releases, it would climb uncontrollably until the system became unresponsive.

dannycjones commented 8 months ago

> Update: we can no longer reproduce this leak behavior using the latest release.
>
> As we run the scan, we see memory usage creep upwards, but at the end of the scan it's on the order of 100MB used. That usage doesn't go away until we unmount, but it doesn't go higher than that. In earlier releases, it would climb uncontrollably until the system became unresponsive.

That's brilliant news, thank you for sharing!

I hope it is broadly applicable, though I doubt it will have solved the issue @sunl is facing. The logs shared show only one file handle hanging around.

In the last release (v1.3.2), we released a change (https://github.com/awslabs/mountpoint-s3/commit/6e7252dd2e54932e277e5b5ee7000f9bc816a682) which fixed a potential memory leak due to unreleased file handles. This leak was reported in #670, and you may have been impacted if you saw lots of messages in the logs like "mountpoint_s3::fuse: release failed: unable to unwrap file handle reference". I'll share this in that particular issue so folks can try it out too.
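To check your own logs for that symptom, a grep along these lines should work (the log path is illustrative):

# count matching log lines (per file) for the failed-release message
grep -c 'unable to unwrap file handle reference' /var/log/mountpoint-s3/mountpoint-s3-*.log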

anelson commented 8 months ago

> In the last release (v1.3.2), we released a change (6e7252d) which fixed a potential memory leak due to unreleased file handles. This leak was reported in #670, and you may have been impacted if you saw lots of messages in the logs like "mountpoint_s3::fuse: release failed: unable to unwrap file handle reference". I'll share this in that particular issue so folks can try it out too.

I do recall seeing that message earlier in one of our test envs late last year (running an earlier version though I don't recall which one), but I didn't have time then to capture logs and look into it. I definitely did not see that when I was trying to reproduce this most recently.

Anyway, sorry to hijack @sunl's issue, I thought I was hitting the same bug. I'll shut up now ;)