We observed large read amplification when a user performs incremental but non-continuous sequential reads over a large number of files. After labeling object GET requests by purpose, we found that prefetching is the main cause of the read amplification.
It turns out that the deduplication in the prefetcher::do method does not work as expected, especially when there is only one prefetcher (the default behavior), which causes duplicate prefetching. It is better to deduplicate before inserting into the pending queue.
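
The idea of deduplicating at enqueue time can be sketched roughly as follows. This is a minimal illustration, not the code in this change: the type `dedupPrefetcher`, its fields `busy`, `pending`, and `op`, and the method `fetch` are hypothetical names chosen for the example, and the choice to drop keys when the queue is full is an assumption.

```go
package main

import "sync"

// dedupPrefetcher is a hypothetical stand-in for the real prefetcher.
// busy tracks every key that is either waiting in pending or being fetched.
type dedupPrefetcher struct {
	mu      sync.Mutex
	busy    map[string]struct{}
	pending chan string
	op      func(key string) // the actual prefetch work (assumed signature)
}

func newDedupPrefetcher(queueSize int, op func(key string)) *dedupPrefetcher {
	return &dedupPrefetcher{
		busy:    make(map[string]struct{}),
		pending: make(chan string, queueSize),
		op:      op,
	}
}

// fetch enqueues key unless it is already queued or in flight.
// It reports whether the key was actually enqueued.
func (p *dedupPrefetcher) fetch(key string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, ok := p.busy[key]; ok {
		return false // duplicate: rejected before it reaches the queue
	}
	select {
	case p.pending <- key:
		p.busy[key] = struct{}{}
		return true
	default:
		return false // queue full: drop the request (an assumption of this sketch)
	}
}

// do drains the pending queue; run it in one or more goroutines.
// A key is released only after its fetch has finished.
func (p *dedupPrefetcher) do() {
	for key := range p.pending {
		p.op(key)
		p.mu.Lock()
		delete(p.busy, key)
		p.mu.Unlock()
	}
}
```

Because the check happens in `fetch`, a key that is requested repeatedly occupies at most one slot in the queue regardless of how many workers run `do`, so the behavior is the same with a single prefetcher as with many.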