hasse69 / rar2fs

FUSE file system for reading RAR archives
https://hasse69.github.io/rar2fs/
GNU General Public License v3.0
279 stars · 27 forks

delete #135

ghost closed this issue 4 years ago

ghost commented 4 years ago

Would it be possible to make the cache persistent, so that the metadata collected during a mount survives a remount? Re-scanning a large collection of archives after every mount takes a long time.

hasse69 commented 4 years ago

The idea has popped up many times, and every time it has been dropped after deeper analysis. It would be very hard to make such a non-volatile cache work flawlessly, and the increased logic and design complexity it would require does not justify such a huge undertaking. There are other things you can do to speed up the process significantly after a mount: use a value as low as possible for --seek-length (without losing data), and use the new warmup mount option. The cache warm-up starts automatically in the background at mount time and populates the entire directory cache. Note that there is not just one cache but two implemented by rar2fs: one for directory listings and one for actual file properties. The two serve very different purposes, but they complement each other.
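A minimal sketch of a mount using the two options mentioned above. The paths are examples, not from the thread, and the exact option spelling should be checked against your rar2fs version's --help output:

```shell
# --seek-length=1 limits header scanning to the first volume file of each set;
# raise it if your archives need more volumes scanned to list correctly.
# 'warmup' populates the entire directory cache in the background at mount time.
rar2fs --seek-length=1 -o warmup /abs/path/to/archives /mnt/rar2fs
```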

karibertils commented 4 years ago

RocksDB would probably be useful to implement this.

hasse69 commented 4 years ago

@karibertils thanks for the pointer, but as things stand right now there will be no effort made to save data/cache in some persistent storage. Hence it is not the storage as such that is the problem, it is reading it back that is error prone and rather non-deterministic.

hasse69 commented 4 years ago

@braderhart Unless there is something more that you wish to add I would like to close this issue.

hasse69 commented 4 years ago

@braderhart The warmup mount option was added in v1.29.0, so it is already available. AFAIK the use case you are referring to, in the majority of cases, involves archive volumes that contain only a single file. In that case, use the --seek-length=1 option to speed up meta-data lookup.

But there is still something that I fail to understand here. Extraction speed has nothing to do with the cache, really. Extraction speed itself is not much worse than reading a native file on a local file system, with the exception of the possible overhead caused by FUSE itself or, if the file is compressed, the overhead of unpacking it. But for streaming data you would probably not even notice the difference? A recursive mount point lookup (e.g. ls -R) might be a bit painful for huge collections, but if you know what path to search it would not cause much overhead, if any.
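The difference between a full recursive scan and a targeted lookup can be seen directly with timing; the mount point path below is hypothetical:

```shell
# Touches every directory entry in the tree; slow on huge collections.
time ls -R /mnt/rar2fs > /dev/null
# Populates only the one directory you actually need; cheap.
time ls /mnt/rar2fs/some/known/path
```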

hasse69 commented 4 years ago

That sounds really bad? Can you tell if the size of the individual RAR archive volumes affects the time it takes? You could create a volumed test archive with very small volume sizes.
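One way to build such a test, assuming the non-free `rar` archiver and rar2fs are installed (all paths and sizes here are illustrative; volume file naming depends on your rar version):

```shell
mkdir -p archdir mnt
# ~5 MB of random data, split into 100 KB volumes with rar's -v switch.
dd if=/dev/urandom of=archdir/testdata.bin bs=1M count=5
(cd archdir && rar a -v100k test.rar testdata.bin && rm testdata.bin)
# Mount the directory holding the volumes and time directory population.
rar2fs --seek-length=1 "$PWD/archdir" "$PWD/mnt"
time ls mnt
```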

As I have stated before, only headers are read in order to access file names etc. inside an archive. For volumes we in fact always need to read two volumes, for technical reasons. But the amount of data needed from each volume file should be on the order of bytes, not even kilobytes. Could it be that rsync does not handle this very well and tries to download the entire file before rar2fs gets a chance to open it? Then things would depend entirely on your network speed. I would bet that if you put that archive on some local file system and mounted it using rar2fs, it would take nowhere near a second to populate the directory entries when using --seek-length=1.
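To rule out the network entirely, the volume set can be copied to local disk and mounted from there; the paths below are hypothetical:

```shell
mkdir -p /tmp/local-test /tmp/mnt
cp /net/share/archive.part*.rar /tmp/local-test/
# With --seek-length=1 the local mount should populate its
# directory entries almost instantly.
rar2fs --seek-length=1 /tmp/local-test /tmp/mnt
time ls /tmp/mnt
```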

hasse69 commented 4 years ago

Any updates here?

hasse69 commented 4 years ago

Closing this due to inactivity.