hasse69 / rar2fs

FUSE file system for reading RAR archives
https://hasse69.github.io/rar2fs/
GNU General Public License v3.0
275 stars 27 forks source link

Regression: Reduce memory footprint during archive scan #128

Closed KyleSanderson closed 4 years ago

KyleSanderson commented 4 years ago

rar2fs currently does not list archive files anymore. Prior to this commit is good - this is the commit that breaks my archives.

rar2fs#/mnt/Web /mnt/WebU fuse ro,allow_other,uid=kyle,gid=storage,umask=0222,kernel_cache,--seek-length=2,--date-rar  0 0
FileServer /tmp/rar2fs # git bisect bad
4bc904fd182f0365e41f9130408ec6be437f2423 is the first bad commit
commit 4bc904fd182f0365e41f9130408ec6be437f2423
Author: Hans Beckerus <hans.beckerus at gmail.com>
Date:   Sat Nov 23 15:39:39 2019 +0100

    Reduce memory footprint during archive scan

    When scanning an archive for files a linked list it created with all
    files and properties before being processed by file system functions
    such as readdir. This cause some memory overhead since a lot of data
    is required to be kept resident for a longer period of time. Since the
    lifetime of the data collected is relatively short there is not need
    to pre-fetch all information like this. Instead handle file by file
    and use only a single temporary object to hold whatever meta data is
    necessary. The performance is also expected to be improved by a change
    like this since less dynamic heap allocations are required but it also
    results in a loop unwind that will increase number of functions calls.
    Measurements of some common use-cases indicated a performance increase
    of approximately 15%-20% but there are also reports of no improvement
    at all or even the opposite. The latter should however be considered a
    rare and exceptional case.

    This change was triggered by issue #122 for which a very huge archive
    was mounted with more than 100k files.

    Signed-off-by: Hans Beckerus <hans.beckerus at gmail.com>

 src/dllext.cpp | 211 +++++++++++++++++++++++++++------------------------------
 src/dllext.hpp |  14 ++--
 src/rar2fs.c   | 146 +++++++++++++++++++--------------------
 3 files changed, 174 insertions(+), 197 deletions(-)
hasse69 commented 4 years ago

Can you try to remove the --seek-length option? Maybe I overlooked something that broke that functionality.

KyleSanderson commented 4 years ago

Still broken with --seek-length removed.

hasse69 commented 4 years ago

Roger that. My own tests shows it works fine without. If I cannot find a swift solution I will revert the patch until a proper solution can be made available.

KyleSanderson commented 4 years ago

My setup is (now)...

/disks/*T* /mnt fuse.mergerfs allow_other,use_ino 0 0
/disks/*T*/Web /mnt/Web fuse.mergerfs allow_other,use_ino,nonempty 0 0
rar2fs#/mnt/Web /mnt/WebU fuse ro,allow_other,uid=kyle,gid=storage,umask=0222,kernel_cache,--date-rar 0 0

I use sshfs on the other end (yeah - FUSE heavy here) and it's particularly fiddly about the options passed. This was tested by doing ls on a local directory (SSHFS has no part in this now) and seeing my .nfos and .sfvs but no .mkvs or similar. Path is something like /mnt/WebU/Television/CBC Marketplace/Video1/media.r{00..66}.

Happy to test any patches

hasse69 commented 4 years ago

Closing this since problematic commit has been reverted.

KyleSanderson commented 4 years ago

Confirmed - master now results in a working build again 🎉

hasse69 commented 4 years ago

Thanks @KyleSanderson , even though this issue it closed, would it be possible for you to try this patch on top of master/HEAD?

issue128.patch.txt

It resolves the problems I could find with the original solution.

KyleSanderson commented 4 years ago

Looking good here. Applied with git apply issue128.patch.txt.

hasse69 commented 4 years ago

Thanks. Will merge this back to master again later.

hasse69 commented 4 years ago

@KyleSanderson Really sorry, but would it be possible for you to try this one on top of 27011db06c724c4bc90806c1a9e7fb0c0289fed1 again?

issue128_2.patch.txt

# git checkout 27011db
# patch -p1 < issue128_2.patch.txt

I really need for you to confirm this one so that I can continue on issue #124 since this is a blocker.

hasse69 commented 4 years ago

Temporarily re-opening this until last patch has been verified.

KyleSanderson commented 4 years ago

@hasse69 ACK. Runs OK here on a couple test archives (didn't thoroughly test - but an open / read works).

hasse69 commented 4 years ago

@KyleSanderson thanks for testing