Open LuQQiu opened 2 years ago
@yyongycy @yuzhu @maobaolong @ssz1997 @apc999 Creating a GitHub issue to track ideas on how to support 100 million or 1 billion small files. Please share your suggestions. We will hold discussions in the future.
A few heuristic questions:
How to access files that exist in UFS but not in goosefs?【follower-readOnly】
Is your feature request related to a problem? Please describe. Training against Alluxio performs well with fewer than 100 million small files. When the training dataset grows to between 100 million and 1 billion small files, training performance degrades significantly, especially when the training job performs a global data shuffle between epochs and multiple nodes are involved.
The performance degradation arises because the Alluxio FUSE client may not be able to hold all of the cached metadata in process memory. The entire local metadata cache is invalidated between epochs, so the Alluxio master has to serve global metadata requests during each epoch.
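To make the effect concrete, here is a toy simulation of a client-side LRU metadata cache under a global shuffle. It is only a sketch and does not reflect Alluxio's actual cache implementation: the function name `simulate_master_lookups` and all of its parameters are hypothetical, and each cache miss is assumed to cost one metadata request to the master.

```python
import random
from collections import OrderedDict


def simulate_master_lookups(num_files, cache_capacity, epochs,
                            invalidate_between_epochs=False, seed=0):
    """Return, per epoch, how many metadata lookups miss the local cache.

    Toy model (assumption): every miss is one metadata request to the master.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # file id -> placeholder metadata, kept in LRU order
    file_ids = list(range(num_files))
    misses_per_epoch = []

    for _ in range(epochs):
        if invalidate_between_epochs:
            cache.clear()  # models the cache being dropped between epochs
        rng.shuffle(file_ids)  # global shuffle: new access order every epoch
        misses = 0
        for fid in file_ids:
            if fid in cache:
                cache.move_to_end(fid)  # mark as most recently used
            else:
                misses += 1
                cache[fid] = True
                if len(cache) > cache_capacity:
                    cache.popitem(last=False)  # evict least recently used
        misses_per_epoch.append(misses)
    return misses_per_epoch


if __name__ == "__main__":
    # Cache large enough for every file: only the first epoch hits the master.
    print(simulate_master_lookups(1000, 1000, epochs=3))
    # Cache holds 10% of the files: most lookups reach the master every epoch.
    print(simulate_master_lookups(1000, 100, epochs=3))
    # Cache invalidated between epochs: every lookup reaches the master.
    print(simulate_master_lookups(1000, 1000, epochs=3,
                                  invalidate_between_epochs=True))
```

In this model, once the cache is smaller than the dataset, at least `num_files - cache_capacity` lookups per epoch must go to the master regardless of eviction policy, and invalidating the cache between epochs makes every lookup a miss.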
Describe the solution you'd like
Describe alternatives you've considered
Urgency
Additional context