andrinux / fusecompress

GNU General Public License v2.0

rare but severe slowdowns when reading large files linearly #24

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
When reading several dozen GB of data from files roughly 1 GB each, fusecompress occasionally grinds to a near-standstill: it appears to reread the entire file over and over again, once for each page read. This is very strange because it happens when copying with "cp", which invariably reads data linearly in blocks of 128 KB. One possible explanation would be that FUSE, which always seems to request data pagewise, in rare cases issues those page requests out of order within such a 128 KB block.

Investigating...
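For context on why a single out-of-order page request could be this expensive: the compressed stream offers no random access, so a read that starts before the current decompression position forces a restart from offset 0. The toy model below (plain C, not fusecompress code; every name in it is made up) simply counts the redundant decompression work that one swapped pair of pages causes within an otherwise linear 128 KB block:

```c
/*
 * Toy model (not fusecompress code) of why an out-of-order page request
 * is expensive on a stream-compressed file: the decompressor has no
 * random access, so any read starting before the current stream position
 * must re-decompress from offset 0. All names here are hypothetical.
 */
#include <stdio.h>

#define PAGE 4096L

static long stream_pos = 0;   /* next uncompressed offset the stream can emit */
static long work_done  = 0;   /* total bytes decompressed (incl. discarded)   */

static void read_page(long offset)
{
    if (offset < stream_pos)                     /* backward jump: restart */
        stream_pos = 0;
    work_done += (offset + PAGE) - stream_pos;   /* decompress up to end of page */
    stream_pos = offset + PAGE;
}

int main(void)
{
    /* 32 pages of a 128 KB block, with pages 4 and 5 requested swapped */
    for (long p = 0; p < 32; p++) {
        long q = (p == 4) ? 5 : (p == 5) ? 4 : p;
        read_page(q * PAGE);
    }
    printf("decompressed %ld bytes to serve %ld bytes of reads\n",
           work_done, 32 * PAGE);
    return 0;
}
```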

Original issue reported on code.google.com by ulrich.h...@gmail.com on 5 Sep 2008 at 10:42

GoogleCodeExporter commented 8 years ago
What I found out so far:

1. FUSE does indeed sometimes read data it has already read once more. Such rereads are usually ironed out by caching, but if the cache is full, problems like the one described above could occur. (Although I have not yet managed to run into one.)
2. The condition for complete decompression on read cannot actually be met if cache_skipped is enabled. The test (!cache_this_read || decomp_cache_size > max_decomp_cache_size) can never become true, because we only cache data while decomp_cache_size < max_decomp_cache_size, so -- assuming that everything is page-aligned -- the worst we can get is decomp_cache_size == max_decomp_cache_size.

I guess the solution is to fix point 2 and fall back to full decompression when the cache is full. It is IMO questionable, however, whether it makes sense to wait until the cache is full before doing so.
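To make point 2 concrete, here is a minimal, self-contained sketch of the comparison in question; apart from cache_this_read, decomp_cache_size and max_decomp_cache_size quoted above, everything in it is made up for illustration:

```c
/*
 * Minimal sketch of the condition described in point 2; only the three
 * identifiers quoted above come from fusecompress, the rest is invented.
 */
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096
static size_t decomp_cache_size;
static const size_t max_decomp_cache_size = 256 * PAGE_SIZE;

/* read path: decide whether to decompress the whole file for this read */
static bool must_decompress_fully(bool cache_this_read)
{
    /* with page-aligned caching the cache is only filled while
     * decomp_cache_size < max_decomp_cache_size, so the worst case is
     * equality and the '>' test can never become true here */
    return !cache_this_read || decomp_cache_size > max_decomp_cache_size;
}

/* cache path: only add a page while there is still room */
static void maybe_cache_page(void)
{
    if (decomp_cache_size < max_decomp_cache_size)
        decomp_cache_size += PAGE_SIZE;   /* tops out exactly at the limit */
}

int main(void)
{
    while (decomp_cache_size < max_decomp_cache_size)
        maybe_cache_page();
    /* even with the cache completely full, the old condition stays false */
    return must_decompress_fully(true) ? 1 : 0;
}
```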

Original comment by ulrich.h...@gmail.com on 5 Sep 2008 at 12:39

GoogleCodeExporter commented 8 years ago
Another thing badly wrong with the aforementioned condition is that it may allow reads from files in direct-write mode without decompression if caching is enabled.

Original comment by ulrich.h...@gmail.com on 5 Sep 2008 at 12:52

GoogleCodeExporter commented 8 years ago
Fixed in r69:
a) The condition in question is now decomp_cache_size >= max_decomp_cache_size.
b) If it is possible to tell that the skip will not fit in the cache, we fall back instead. If we cannot fall back, we abstain from filling up the cache with data we most likely won't need for anything.
c) Skipping file->size bytes is now enough to trigger a fallback (it was 3 * file->size before), but only data that is not cached counts towards it.

The Thing From Comment #2(tm) has also been fixed (see issue #26).
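For reference, here is a rough sketch of what the r69 decision logic amounts to as described above; the structure and the helper names (handle_skip, skipped_uncached, can_fall_back) are assumptions for illustration, not the actual fusecompress code:

```c
/*
 * Rough sketch of the r69 skip handling as described in points a)-c);
 * only the identifiers quoted above come from fusecompress, everything
 * else is hypothetical.
 */
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

struct file { off_t size; };

static size_t decomp_cache_size;
static size_t max_decomp_cache_size;

enum skip_action { SKIP_CACHE, SKIP_DISCARD, SKIP_FALLBACK };

/* decide what to do when a read skips skip_len bytes ahead, of which
 * skipped_uncached bytes are not already in the decompression cache */
static enum skip_action handle_skip(const struct file *file,
                                    size_t skip_len,
                                    size_t skipped_uncached,
                                    bool can_fall_back)
{
    /* (c) only uncached data counts; skipping file->size bytes
     *     (previously 3 * file->size) triggers the fallback */
    if (can_fall_back && (off_t)skipped_uncached >= file->size)
        return SKIP_FALLBACK;

    /* (a)+(b) a full cache, or a skip that cannot fit, means we either
     *     fall back or at least stop stuffing the cache with data we
     *     most likely will not need */
    if (decomp_cache_size >= max_decomp_cache_size ||
        decomp_cache_size + skip_len > max_decomp_cache_size)
        return can_fall_back ? SKIP_FALLBACK : SKIP_DISCARD;

    return SKIP_CACHE;   /* skip fits: cache it in case it is read later */
}
```

Note that the second test also speaks to the earlier question of waiting until the cache is full: a skip that obviously will not fit triggers the fallback (or at least stops cache filling) right away rather than after the cache has filled up.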

Original comment by ulrich.h...@gmail.com on 5 Sep 2008 at 3:19