Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

After opening the client cache, the memory used will exceed the upper limit #17989

Open wangw-david opened 1 year ago

wangw-david commented 1 year ago

Alluxio Version: I deployed Alluxio 2.9.0 (alluxio/alluxio-dev:2.9.0) with Fluid 0.9.1.

Describe the bug
Here is my client cache configuration:
alluxio.user.client.cache.enabled: "true"
alluxio.user.client.cache.store.type: MEM
alluxio.user.client.cache.size: 1GB
alluxio.user.client.cache.page.size: 4MB

When I read and write 4M files, everything is normal. When I start to read and write 4K files, the FUSE JVM often runs out of memory (OOM). I used jmap -heap xxx to monitor the JVM usage of the FUSE process:

filesize   num    tenured generation used
4M         1024   948MB
2M         1024   1811MB
1M         1024   3643MB
256K       1024   4233MB
4K         1024   4164MB

When I turn off client caching (alluxio.user.client.cache.enabled: "false"), the memory usage is only 23MB:

filesize   num    tenured generation used
1M         1024   23MB

If I enable the cache and set the page size to 64K, the memory usage also becomes very small:

filesize   num    tenured generation used
4K         1024   86MB

Can someone tell me why? Why do small files take up so much memory, apparently exceeding the configured upper limit?
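For what it's worth, the measurements above are close to what a simple back-of-envelope model predicts. The sketch below assumes (this is only a guess that fits the numbers, not something confirmed by the Alluxio code or by anyone in this thread) that the cache counts only the bytes actually stored against alluxio.user.client.cache.size, while each cached page pins a buffer of the full page size in memory. It also assumes "num" in the table is the number of files. The function name and parameters are hypothetical.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def estimate_pinned_mb(file_size, num_files, page_size, cache_size):
    """Hypothetical model, not Alluxio's actual accounting.

    Assumes the cache limit is enforced on logical bytes stored,
    while every cached page holds a full page_size buffer.
    """
    # Pages needed per file (ceiling division).
    pages_per_file = -(-file_size // page_size)
    # Files that fit under the limit when only logical bytes are counted.
    files_cached = min(num_files, cache_size // file_size)
    return files_cached * pages_per_file * page_size / MB


if __name__ == "__main__":
    # (file size, page size, observed tenured generation in MB from the tables)
    cases = [
        (4 * MB, 4 * MB, 948),
        (2 * MB, 4 * MB, 1811),
        (1 * MB, 4 * MB, 3643),
        (256 * KB, 4 * MB, 4233),
        (4 * KB, 4 * MB, 4164),
        (4 * KB, 64 * KB, 86),
    ]
    for file_size, page_size, observed in cases:
        est = estimate_pinned_mb(file_size, 1024, page_size, 1 * GB)
        print(f"file={file_size // KB}K page={page_size // KB}K "
              f"estimated={est:.0f}MB observed={observed}MB")
```

Under these assumptions the estimates come out to 1024, 2048, 4096, 4096, 4096, and 64 MB, which roughly track the observed values. That would be consistent with memory scaling with the page size rather than the file size, but it does not prove that this is the actual cause.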


LuQQiu commented 1 year ago

Thanks for reporting the issue. We have fixed some memory issues in Alluxio 3, which will launch soon and can already be used via the Alluxio main branch. The Alluxio 2 line has some read prefetch behavior, especially when reading small pages (4KB in particular), which results in unexpectedly large memory consumption.

humengyu2012 commented 1 year ago

Are you able to modify the code? If so, try this pull request: https://github.com/Alluxio/alluxio/pull/17636, which should achieve the same effect as the client cache. For details, please refer to this article: https://zhuanlan.zhihu.com/p/641450419.

wangw-david commented 1 year ago

@LuQQiu Thanks for your reply. @humengyu2012 OK, I will try this PR, thanks.