Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0
6.86k stars 2.94k forks

The problem of medium to large file fuse concurrent reading #14586

Open jayzhenghan opened 2 years ago

jayzhenghan commented 2 years ago

Problem 1: Compared with a single reader, concurrent reads of the same 300 MB file increase read latency by dozens of times. FUSE options: kernel_cache, max_readahead=32.

8 concurrent reads: [screenshot]

5 concurrent reads: [screenshot]

3 concurrent reads: [screenshot]

1 read: [screenshot]

Problem 2: With FUSE option kernel_cache, Alluxio receives out-of-order read requests caused by the kernel's readahead. [screenshot]

[screenshot]
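To make "out-of-order" concrete, here is a minimal sketch of one way to spot such requests in a trace. It assumes the read requests have already been extracted as (offset, length) pairs in arrival order; the function name and the sample trace are hypothetical and do not come from Alluxio or from the screenshots. A request is flagged when it starts before the furthest byte that an earlier request has already covered.

```python
def find_backward_reads(requests):
    """Return indices of requests that start before the furthest end seen so far.

    `requests` is a list of (offset, length) tuples in arrival order.
    This input format is an assumption made for illustration only.
    """
    backward = []
    furthest_end = 0
    for index, (offset, length) in enumerate(requests):
        if offset < furthest_end:
            # This request targets a region behind one that already arrived.
            backward.append(index)
        furthest_end = max(furthest_end, offset + length)
    return backward


# Hypothetical trace: the third request arrives after a later region was requested.
KB = 1024
sample_trace = [(0, 128 * KB), (256 * KB, 128 * KB), (128 * KB, 128 * KB), (384 * KB, 128 * KB)]
print(find_backward_reads(sample_trace))
```

A purely sequential trace yields an empty list, so a non-empty result is a quick signal that the reader on the Alluxio side cannot rely on a simple forward stream.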

Problem 3: Severe read amplification.

  1. 3 concurrent reads of a 300 MB file (100% cached): [screenshot]
  2. 3 concurrent reads of a 300 MB file (not cached): [screenshot]
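For reference, read amplification can be expressed as the ratio of bytes actually fetched to bytes the application asked for. The sketch below assumes that definition; the function name and the byte counts in the example are hypothetical and are not measurements from this issue.

```python
def read_amplification(bytes_fetched, bytes_requested):
    """Ratio of bytes fetched from the backing layer to bytes the readers requested.

    A value of 1.0 means no amplification. Both inputs are assumed to be
    collected separately, for example from metrics or logs.
    """
    if bytes_requested <= 0:
        raise ValueError("bytes_requested must be positive")
    return bytes_fetched / bytes_requested


MB = 1024 * 1024
requested = 3 * 300 * MB   # three readers, each reading a 300 MB file
fetched = 1800 * MB        # hypothetical figure, for illustration only
print(read_amplification(fetched, requested))
```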
jayzhenghan commented 2 years ago

@LuQQiu @maobaolong @yuzhu

maobaolong commented 2 years ago

@jayzhenghan Would you please paste the related resources here, such as read.py?

jayzhenghan commented 2 years ago

> @jayzhenghan Would you please paste the related resources here, such as read.py?

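# read.py: read the whole test file once through the FUSE mount
fd = open("test/300MB", "r")
print(fd)
# readlines() pulls the entire file into memory
line = fd.readlines()
fd.close()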
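The script above performs a single read. A minimal sketch of how the 1, 3, 5 and 8 reader cases might be driven from one process is shown below. It is not the reporter's actual harness: the function names are hypothetical, it reads in binary chunks rather than with readlines(), and it uses threads, which is an assumption about how the concurrency was produced.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def read_whole_file(path, chunk_size=1 << 20):
    """Read the file sequentially in binary chunks; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def run_concurrent(path, readers):
    """Start `readers` threads that each read the same file; return per-reader results."""
    with ThreadPoolExecutor(max_workers=readers) as pool:
        return list(pool.map(read_whole_file, [path] * readers))


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]  # for example, a file under the Alluxio FUSE mount point
    for n in (1, 3, 5, 8):
        results = run_concurrent(target, n)
        slowest = max(seconds for _, seconds in results)
        print("%d reader(s): slowest read took %.2f s" % (n, slowest))
```

Because kernel_cache keeps file data in the page cache across opens, later runs may be served from memory; dropping caches or remounting between runs would be needed for comparable numbers.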

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.