Closed m0vie closed 5 years ago
Thanks for the issue report.
Currently this is how rar2fs works. You obviously cannot have the same name provided by two different sources. But to not complicate things too much, the choice right now is that a collision at the first level is enough to trigger the filter. That means it does not matter if the local path is dir/foo and the path in the archive is dir/fee: the collision is on the dir level, and thus nothing from dir in the archive will show. The reason why dir/fee can still be accessed is also due to simplification. The file/dir does exist in the cache, and it was simply easier not to try to remove it when the duplicate filter triggers. I do not find this to be a very serious problem/limitation, thus I will tag this as an enhancement with medium priority.
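To make the first-level collision rule described above concrete, here is a minimal sketch in Python. It is not rar2fs code (rar2fs is written in C); the function names are hypothetical and the model covers only directory listings, not the direct-access path that still finds dir/fee in the cache.

```python
from pathlib import PurePosixPath


def first_component(path):
    """Return the first path component, e.g. 'dir' for 'dir/fee'."""
    return PurePosixPath(path).parts[0]


def visible_archive_paths(local_paths, archive_paths):
    """Hypothetical model of the duplicate filter: hide every archive
    path whose first component also exists locally."""
    local_top = {first_component(p) for p in local_paths}
    return [p for p in archive_paths if first_component(p) not in local_top]


local = ["dir/foo", "fileLocal.txt"]
archive = ["dir/fee", "dir3/fileRared.txt", "fileRared.txt"]
print(visible_archive_paths(local, archive))
# prints ['dir3/fileRared.txt', 'fileRared.txt']
```

Note that dir/fee is filtered even though no local file has that exact name; the collision on dir alone is enough.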
I tried to reproduce your specific issue and expected to see the same result as you did. Unfortunately, I saw something much much worse. I do not see this:
mount/dir3/fileRared.txt
I get an empty directory, which obviously is wrong! So there is a bug sneaking around here, not really related to your initial report.
My bad. I accidentally had a local and empty directory called dir3 in the root. That would hide the one coming from the archive. False alarm, that is!
I played around with this some more. I found a way to (temporarily) merge both caches and have them show up:
Start with
root/dir1_TEMP/fileLocal.txt
root/archive.rar:dir1/fileRared.txt
Mount and list the directories /mount, /mount/dir1_TEMP, and /mount/dir1 to build up all caches.
Now, rename root/dir1_TEMP to root/dir1.
Refresh /mount/.
Now, both
mount/dir1/fileLocal.txt
mount/dir1/fileRared.txt
show up.
(Until the caches are invalidated, via USR1 or by making changes through /mount.)
Yes, what you just did was in fact exploiting a bug in rar2fs. We should invalidate the cache when a local file/dir is renamed. I think we might be missing that. Note that local files are never placed in the rar2fs cache, only archive files are.
EDIT: Actually it is not a bug, rather a known limitation and nothing is really missing either.
The problem is that you are changing the original/local file system after mount. There is currently no way for rar2fs to detect that. Operations on the local file system do not go through the FUSE kernel module, and thus no callback is made to the user file system. The only way to solve that would be to put some watch on the local file system, which for several reasons is not recommended to be done from within rar2fs. If this is a use-case of real importance to you, I think your best bet is to place some inotify monitor(s) on your local file system and trigger an invalidation of the rar2fs cache when they fire.
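As a rough illustration of such an external monitor, the sketch below polls the source tree and signals rar2fs when something changes. Two things are assumptions here: polling is swapped in for inotify because the Python standard library has no inotify binding, and the process id passed to invalidate_rar2fs is something you would have to look up yourself. The use of USR1 for invalidation follows what is mentioned earlier in this thread.

```python
import os
import signal
import time


def snapshot(root):
    """Map every path under root to (mtime_ns, size) as a cheap fingerprint."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except FileNotFoundError:
                continue  # entry vanished between walk and stat
            state[full] = (st.st_mtime_ns, st.st_size)
    return state


def watch(root, on_change, interval=2.0, rounds=None):
    """Poll root and call on_change() whenever the snapshot differs.

    rounds=None polls forever; a number limits the polling cycles.
    """
    previous = snapshot(root)
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        current = snapshot(root)
        if current != previous:
            on_change()
            previous = current
        done += 1


def invalidate_rar2fs(pid):
    """Send SIGUSR1 to the rar2fs process; pid is supplied by the caller."""
    os.kill(pid, signal.SIGUSR1)


# Example (not run here; 12345 is a placeholder pid):
# watch("source", lambda: invalidate_rar2fs(12345))
```

A real inotify-based monitor would react immediately instead of once per interval, but the overall shape (detect a change, then signal rar2fs) would be the same.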
Please try this patch on master/HEAD
It should allow you to merge entries from local file system with the ones coming from the archive, still with some of the previous mentioned limitations.
Hmm, that patch only makes things after the initial rename trick a bit more stable (even after local modifications both local and rared content still shows).
My goal is actually to get the local+rar content to always show up merged (even read-only would be fine).
This is what I came up with:
I am not familiar with the code-base, but my idea is to simply skip the rar cache if the directory exists locally. This probably has some performance impact but this so far works for my requirement.
> Hmm, that patch only makes things after the initial rename trick a bit more stable (even after local modifications both local and rared content still shows).
Not sure what you mean here? This patch merges directory entries from both the local file system and the RAR archive, so it works according to your initial post? No "tricks" are needed?
The directory cache has nothing to do with RAR or not. The file cache is only used for RAR files but the directory cache is common for all files.
> Not sure what you mean here? This patch merges directory entries from both the local file system and the RAR archive, so it works according to your initial post? No "tricks" are needed?
Didn't work for me. Directly after mounting only the local content showed up.
I have looked at your patch now and I think what it does sort of defeats the entire purpose of having the cache in the first place. I am not really sure I understand your use-case, but with the patch I provided you can still modify the local file system through your mount point as long as it does not try to modify anything that is in the RAR file cache, i.e. RAR contents. Such changes should be spotted directly. Changing stuff behind the back of rar2fs (or the mount point, to be more precise) is not something the file system is designed for. The use-case simply is too thin. What we can do is add an option to bypass the cache in readdir using a switch or something, and do something similar to what you did. The decision is thus moved to the user, who can decide whether the use-case justifies such behavior despite the performance penalty.
> Didn't work for me. Directly after mounting only the local content showed up.
Ok, that was interesting because that is not what I saw. Maybe you can provide an example archive and a local directory structure so that I can try for myself?
This is my file structure which I tested on (source directory)
$ ls -lR
.:
total 16
-rw-rw-r-- 1 hasse hasse 231 Oct 7 21:00 archive.rar
drwxrwxr-x 2 hasse hasse 4096 Oct 9 20:18 dir1
drwxrwxr-x 2 hasse hasse 4096 Oct 8 20:44 dir2
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
./dir1:
total 4
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
./dir2:
total 4
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
And archive.rar contains
UNRAR 5.00 beta 7 freeware Copyright (c) 1993-2013 Alexander Roshal
Testing archive archive.rar
Testing dir1/filerared.txt OK
Testing dir3/filerared.txt OK
Testing filerared.txt OK
All OK
And this is what I see after mounting in the mount point
$ ls -lR
.:
total 24
drwxrwxr-x 2 hasse hasse 4096 Oct 9 20:02 dir1
drwxrwxr-x 2 hasse hasse 4096 Oct 8 20:44 dir2
drwxrwxr-x 2 hasse hasse 4096 Oct 7 20:50 dir3
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:51 filerared.txt
./dir1:
total 8
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:50 filerared.txt
./dir2:
total 4
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
./dir3:
total 4
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:50 filerared.txt
Then entering dir1 in the mount point
dir1 $ touch foobar
dir1 $ ls
filelocal.txt filerared.txt foobar
And even adding something directly to the source dir works
<move to source dir1>
dir1 $ touch feebar
<move to mount point dir1>
dir1 $ ls
feebar filelocal.txt filerared.txt foobar
Is this not in fact covering your use-case? At least that is how I understood it.
Since my memory failed me a bit on the implementation I think we need to recap on how it actually works. The directory cache is basically a cache of folder contents, but only contents provided by RAR archives. Local files are basically never cached this way, but the directories as such are. That means you can add local files to a directory (source or mount point) and that will propagate automatically to the mount point. RAR files on the other hand will not. That is where the cache is a limiting factor. But, to overcome this you can add your RAR files to the mount point instead (provided it was mounted read-write). By doing so the contents of the RAR files will still propagate since any changes to a directory in the cache will invalidate that entry by design. It also works for multi-part archives but during transit of the file set and before the archive is complete the actual names of files will display since they are temporarily treated as local files.
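The recap above can be modelled with a small sketch. This is not the rar2fs implementation; DirCache, archive_index and readdir are hypothetical names, and the model ignores details such as hiding the archive files themselves. It only shows why local additions appear immediately while archive contents need an invalidation of the cached directory.

```python
import os


class DirCache:
    """Caches only the archive-provided names for each directory."""

    def __init__(self, archive_index):
        # Hypothetical stand-in for scanning archives:
        # maps a relative directory to the names found inside RAR files.
        self._archive_index = archive_index
        self._cache = {}

    def archive_entries(self, rel_dir):
        # Populate on first use; later archive changes stay invisible
        # until the entry is invalidated.
        if rel_dir not in self._cache:
            self._cache[rel_dir] = set(self._archive_index.get(rel_dir, ()))
        return self._cache[rel_dir]

    def invalidate(self, rel_dir):
        self._cache.pop(rel_dir, None)


def readdir(source_root, rel_dir, cache):
    """Merge a live listing of local files with cached archive entries."""
    local_dir = os.path.join(source_root, rel_dir)
    local = set(os.listdir(local_dir)) if os.path.isdir(local_dir) else set()
    return sorted(local | cache.archive_entries(rel_dir))
```

In this model, a file created under source/dir1 shows up on the next readdir call, whereas a new name added to archive_index for dir1 appears only after cache.invalidate("dir1"), which mirrors the invalidation that happens when a directory is changed through the mount point.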
I see the problem/difference now.
Starting with the exact same setup:
~/rartest$ ls -lR source
source:
total 12,288
-rw-rw-r-- 1 m0vie m0vie 260 Oct 9 21:40 archive.rar
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:40 dir1
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:39 dir2
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
source/dir1:
total 0
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
source/dir2:
total 0
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
~/rartest$ rar t source/archive.rar
RAR 5.50 Copyright (c) 1993-2017 Alexander Roshal 11 Aug 2017
Trial version Type 'rar -?' for help
Testing archive source/archive.rar
Testing dir1/filerared.txt OK
Testing dir3/filerared.txt OK
Testing filerared.txt OK
Testing dir1 OK
Testing dir3 OK
All OK
Now, what you did:
~/rartest$ rar2fs source/ mount/
~/rartest$ ls -lR mount
mount:
total 20,480
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:40 dir1
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:39 dir2
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:39 dir3
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filerared.txt
mount/dir1:
total 4,096
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filerared.txt
mount/dir2:
total 0
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
mount/dir3:
total 4,096
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filerared.txt
All good! However, if I do this instead:
~/rartest$ rar2fs source/ mount
~/rartest$ cd mount
~/rartest/mount$ ls -l
total 20,480
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:40 dir1
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:39 dir2
drwxrwxr-x 2 m0vie m0vie 4,096 Oct 9 21:39 dir3
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filerared.txt
~/rartest/mount$ cd dir1
~/rartest/mount/dir1$ ls -l
total 0
-rw-rw-r-- 1 m0vie m0vie 0 Oct 9 21:39 filelocal.txt
So cd-ing to the dir first somehow makes a difference.
This honestly makes no sense :( It should not matter if you cd like this or not. I could not reproduce that behavior either. Can you please attach your archive here? I see a slight difference in your archive compared to mine, since in yours the directories seem to have been given distinct entries. Maybe that is affecting something here; if so, I need to investigate what it is.
Ok, as I suspected, that had no effect
$ cd d
d $ ls -l
total 28
-rw-rw-r-- 1 hasse hasse 231 Oct 7 21:00 archive.rarx
drwxrwxr-x 2 hasse hasse 4096 Oct 9 22:01 dir1
drwxrwxr-x 2 hasse hasse 4096 Oct 8 20:44 dir2
drwxrwxr-x 2 hasse hasse 4096 Oct 9 21:39 dir3
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
-rw-rw-r-- 1 hasse hasse 0 Oct 9 21:39 filerared.txt
d $ cd dir1
d/dir1 $ ls -l
total 8
-rw-rw-r-- 1 hasse hasse 19 Oct 7 20:49 filelocal.txt
-rw-rw-r-- 1 hasse hasse 0 Oct 9 21:39 filerared.txt
I honestly cannot understand why it does not work the same for you? Because it should.
Hmm, I have no idea. I get this exact same behavior on both Ubuntu 19.04 and also on Cygwin using WinFSP.
Maybe the libunrar version has something to do with it?
Ok, let me try on Cygwin too. And what version of libunrar are you using?
On Cygwin it was unrarsrc-5.6.5.tar
On Ubuntu I freshly recompiled everything (including your patch) with unrarsrc-5.8.2.tar.gz.
Ok, I tried it on Cygwin, and yes, there is a problem somewhere. But for me filerared.txt does not show up in dir1 whatever I try. So for me doing cd or not makes no difference, which actually makes more sense than the behavior being different. Must dig into why it does not work properly on Cygwin.
EDIT: Nah, both archives behave the same. It works if I first do ls on the root folder and then ls dir. I did notice that on Cygwin nothing happens when sitting inside the mount point, whereas on my Linux box a lot of activity is triggered in the background. Something does not work when the root folder is not in the cache before listing dir1. Need to figure out what that is.
Try this one.
I still see some weird behavior. It still matters which directory is accessed first.
Directly after mounting, it's likely the host OS accessing the mount point in some way, making it hard to reproduce.
But with USR1, I see this consistently:
Cygwin:
$ rar2fs source/ x:
$ killall -USR1 rar2fs; ls /cygdrive/x/dir1
filelocal.txt
$ killall -USR1 rar2fs; ls /cygdrive/x/ > /dev/null; ls /cygdrive/x/dir1
filelocal.txt filerared.txt
Ubuntu:
$ rar2fs source/ mount/
$ killall -USR1 rar2fs; ls mount/dir1
filelocal.txt
$ killall -USR1 rar2fs; ls mount/ > /dev/null; ls mount/dir1
filelocal.txt filerared.txt
Yes, that is because my patch is limited. Try this one instead.
That did the trick :)
$ killall -USR1 rar2fs; ls /cygdrive/x/ >/dev/null; ls /cygdrive/x/dir1
filelocal.txt filerared.txt
$ killall -USR1 rar2fs; ls /cygdrive/x/dir1
filelocal.txt filerared.txt
Thanks a lot for looking into this, really appreciate it!
Ok, great! Thanks a lot for testing :) I will go through the patch once more and do some regression on it before I merge it to master. I will also re-classify this as a bug since after digging more into this I believe that is in fact what it is.
I will reopen this because the solution is not fully working.
There is still a case that does not work. The patch only solves it partly, for things like ls, but direct access to the file still does not work, e.g.
$ cat d/dir1/filerared.txt
cat: d/dir1/filerared.txt: No such file or directory
(if d is the mount point)
I had a quick look at it and this problem is not as easy to solve as I initially thought.
Please try the latest version on master to confirm everything is still working. I will close this again once it has been verified.
I tested using the latest master and everything works as expected for me. 👍
Great! My regression seems to pass too. Closing.
Consider the following structure
After rar2fs root/ mount/, the following is observed when listing directories in mount/:
Note how mount/dir1/fileRared.txt is missing in the directory listing. However, the file can be accessed directly, e.g. using cat mount/dir1/fileRared.txt.
Interestingly, for the root directory, both fileLocal.txt and fileRared.txt show up.