hasse69 / rar2fs

FUSE file system for reading RAR archives
https://hasse69.github.io/rar2fs/
GNU General Public License v3.0

Random RARs not showing up inside mount #69

zappepappe closed 7 years ago

zappepappe commented 7 years ago

Hi again.

Figured I should make a new issue for this, even though I mentioned it in the other thread, since it is unrelated. During our earlier testing I found that some files did not show up inside the mount. Did a git bisect of the issue and found this commit to be the reason:

        a1d898e27bb12c27e0742c7ade4991514946ec40 is the first bad commit
        commit a1d898e27bb12c27e0742c7ade4991514946ec40
        Author: Hans Beckerus
        Date:   Fri Jan 13 10:33:21 2017 +0100

            Improve getattr folder cache logic

            The getattr type calls only checks if the special folder cache
            entry exists, and if so, skips the slow search for files inside
            RAR archives.
            This patch will make sure these calls also populates the cache
            if needed. This will drastically improve the behavior of e.g.
            auto-completion.

            Signed-off-by: Hans Beckerus <hans.beckerus at gmail.com>

        :100644 100644 20177938806b870edddd2d70b7f80ac0a4867d63 d799e24cbc712370e2a9d4f8ca0777924d4816e6 M filecache.c
        :100644 100644 6b6d423802b209217b0eb92a523752feddb0f581 89c94a9e641edabd182c6b621bf85226fc858826 M rar2fs.c

hasse69 commented 7 years ago

Thanks for the report. However, you need to be more precise about what you see. I really doubt this patch makes files randomly disappear. It might be that files do not show up because the cache is already populated; for new files to be discovered, the cache needs to be invalidated. Invalidation is automatic if you write to the mount point rather than to the source directory. It might also be that you stumbled into a problem that is now solved in the latest version on master:

98486d071ef8d5e42d04f9d3fbe4caa5c4a9f6c7

zappepappe commented 7 years ago

Have not been able to reproduce with 98486d0 as of yet. Will try it in more normal conditions later. But it looks like it solved it.

hasse69 commented 7 years ago

Ok, I am currently working on the cache logic since it needs some tweaking due to issue #66

zappepappe commented 7 years ago

I was able to reproduce it now. Not quite sure exactly what triggers it. Might have something to do with samba. Will not be able to test more for some time but will report back when I have. Hopefully I'll be able to give a better explanation then as well.

hasse69 commented 7 years ago

Ok, a simple test when this happens is to log in to the server and check the folder manually. If the files are there, then this is some other problem.

zappepappe commented 7 years ago

Looks like it depends on how I first browse the mount. If I first browse with cd/ls on the server, everything looks fine. But if I first browse through the network share (samba) files will be missing. And yes, if they are missing in the network share they are also missing inside the mount on the server with cd/ls.

hasse69 commented 7 years ago

Ok, that sounds a bit worrying :( And I have no clue why it behaves like that. Try the patch attached to issue #66 and see if it makes any difference. Otherwise, please start rar2fs with -d so that I can check what calls we get from FUSE.

hasse69 commented 7 years ago

And you said this behaves the same with older versions like the released v1.23.1? In that case it cannot be related to the new cache function.

zappepappe commented 7 years ago

No no, I did a git bisect as I said. I bisected between 1.23.1 and the then latest git and the behaviour was not reproducible before that commit.

hasse69 commented 7 years ago

I am a bit lost here. Did you not also say that the behavior was different depending on which user owned the mount?

zappepappe commented 7 years ago

No, I have not. But as a coincidence, I am currently waiting for a large copy of files to finish and was going to test that. Still trying to figure out a simple way for you to reproduce this without having to mirror my entire system. It looks like having many large archives and doing the first browse of the mount through a network share, or possibly Windows Explorer, triggers it.

hasse69 commented 7 years ago

If this is happening in connection to a copy of new files to the mount point the patch from issue #66 might/should help.

zappepappe commented 7 years ago

It is not. I am copying files so that I won't have to experiment with real data.

hasse69 commented 7 years ago

Alright. But rar2fs does not know anything about samba. But the cache must in some way be involved since you say it also does not work after doing a normal ls. When you see this try a kill -USR1 on the main process.
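
For readers unfamiliar with the idea, a signal-driven cache flush usually follows the pattern sketched below. This is a minimal illustration under stated assumptions, not the rar2fs source: the names `cache_flush_requested`, `install_flush_handler`, and `consume_flush_request` are hypothetical, and the thread does not show how rar2fs handles SIGUSR1 internally.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag: set by the handler, polled by the file system code. */
static volatile sig_atomic_t cache_flush_requested = 0;

/* Do only async-signal-safe work here: record the request and return. */
static void on_sigusr1(int signo)
{
    (void)signo;
    cache_flush_requested = 1;
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on failure. */
int install_flush_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART; /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 once per pending request so the caller can drop its cache. */
int consume_flush_request(void)
{
    if (!cache_flush_requested)
        return 0;
    cache_flush_requested = 0;
    return 1;
}
```

A caller would run `install_flush_handler()` once at startup and check `consume_flush_request()` before serving a lookup from the cache, discarding all cached entries when it returns 1.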

zappepappe commented 7 years ago

Tried kill -USR1. First even more files disappeared so I tried it again and then they reappeared.

hasse69 commented 7 years ago

And these were all accesses made from your Windows Samba share? In that case I need some traces of what is going on. And it is using latest master with the patch from issue #66?

zappepappe commented 7 years ago

Master, yes. Patch, no. Still trying to figure out exactly what is going on. But I think I might be on to something.

hasse69 commented 7 years ago

Sounds great! Any help to nail down the exact problem is appreciated!

zappepappe commented 7 years ago

I believe it has to do with permissions, is that possible? I have, one step at a time, made my test folder more like my real folder. Put it on a different dataset (kind of a ZFS partition): nothing. Changed the dataset to ACL permissions: nothing. Made a more complicated permission structure where I am not the owner and special groups decide access: missing files. When browsing the samba share with Explorer I get direct access to files and folders, but when I go to properties and check permissions, the groups that actually give me access to the share take a bit of time to load. Is it in any way possible that this could mess with the cache?

hasse69 commented 7 years ago

No, the cache and rar2fs do not really care much about permissions. The default_permissions option is set automatically by rar2fs, which leaves all this to FUSE. Try removing it in rar2fs.c:

Change

        fuse_opt_add_arg(&args, "-osync_read,fsname=rar2fs,subtype=rar2fs,default_permissions");

to

        fuse_opt_add_arg(&args, "-osync_read,fsname=rar2fs,subtype=rar2fs");

hasse69 commented 7 years ago

There are however a few calls to access(2) here and there. I guess if read access is not granted for some reason it might affect internals in rar2fs.

zappepappe commented 7 years ago

Removing "default_permissions" did not help. I mounted with -d and there is no mention at all of the missing files.

hasse69 commented 7 years ago

Ok, what version are you on? Latest master? I can send you a patch that replaces the use of access with fopen. The access(2) system call checks permissions against the real user and group IDs rather than the effective ones. Maybe it is related. I am a bit confused about this, to be honest. I have never heard of people complaining about this before.
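
The difference between the two checks can be sketched as follows. This is only an illustration of the general technique, not the contents of the patch; the function names `readable_by_real_ids` and `readable_by_effective_ids` are made up for this example.

```c
#include <stdio.h>
#include <unistd.h>

/* access(2) evaluates permissions for the REAL uid/gid of the process. */
int readable_by_real_ids(const char *path)
{
    return access(path, R_OK) == 0;
}

/*
 * Actually opening the file is checked against the EFFECTIVE uid/gid
 * (on Linux, the file system IDs, which normally follow the effective
 * ones), so it reflects what a later read would be allowed to do.
 */
int readable_by_effective_ids(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}
```

The two functions agree whenever the real and effective IDs are the same; they can only disagree when the process runs with differing IDs, for example a set-user-ID binary.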

zappepappe commented 7 years ago

rar2fs v1.23.1-git98486d0

Since I can't reproduce it without the cache commit, which is fairly new, and I have to do pretty elaborate things to make it show, and I'm on a less used platform, that might be the reason nobody has complained before.

hasse69 commented 7 years ago

patch.txt

Try this one, both with and without default permissions.

zappepappe commented 7 years ago

Haven't used diffs with git before. For speed, what would be the command? Just git diff file?

hasse69 commented 7 years ago

Simply do

        patch < patch.txt

hasse69 commented 7 years ago

Btw, are you using the -o allow_other mount option?

zappepappe commented 7 years ago

No luck. Used the one without "default_permissions" as long as the patch didn't add it back. Yes, allow_other is the only option I'm using. Otherwise I would not be able to access the mount at all. Should I try with "default_permissions"?

hasse69 commented 7 years ago

You can try. But I do not give it much hope.

zappepappe commented 7 years ago

Nope. Speed somehow seems to affect the chances. If I go through the folders quickly, it seems more likely that I will get missing files. Clearing the cache while I'm inside the folder often gets the files back.

I won't be able to test more for the time being, sorry. But if you come up with anything I will give it a try as soon as I can.

hasse69 commented 7 years ago

I have no clue really. If the files do not show up in the logs then I am not sure what to look at. But if you are sure this is a unique problem to later versions of rar2fs and you have tried the patch from issue #66 then obviously it must be related to the cache somehow.

hasse69 commented 7 years ago

Any status here?

zappepappe commented 7 years ago

Sorry that I haven't posted any updates, but I haven't tested anything since my last post.

Tried latest master d6666ec today and files are still missing.

hasse69 commented 7 years ago

Bad news then :( I am not really sure what I can do here unless you can narrow it down a bit. The commit you point to as being the trigger of this problem should not have the effect you describe. Also, I am a bit confused as to why I am not seeing it too. Can you also try the attached patch (based on latest master) for reference? I need to check if it makes a difference or not. But I really doubt it :(

patch.txt

zappepappe commented 7 years ago

Will try the patch and do some other testing tomorrow.

While it might not be that commit that is at fault, it could have triggered something else that the cache touches. Since SIGUSR1 can make the files reappear it has to have something to do with the cache, if not directly then by some code it uses to populate itself. It clearly matters how I first access the files/folders, and I guess that is when the cache is being created.

Would it be hard to make a mount option to disable the cache if it's hard to find the cause and revisit this in the future if any new information comes up? Or do you want to figure this out before releasing a new version?

hasse69 commented 7 years ago

I want to figure this out.

hasse69 commented 7 years ago

If this was a simple bug in the cache logic I should see it too. I do not. So you either have found an access pattern that I have missed or it is something platform specific. The latter would surprise me though.

hasse69 commented 7 years ago

The patch is BIG. It basically separates the file cache from the directory cache. Maybe your problem is that hash collisions in a shared cache implementation are not handled properly. But let's see.
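
To make the collision concern concrete, here is a small sketch of a path cache that stays correct when two paths hash to the same bucket, because every lookup compares the full stored path. It is a generic illustration and not taken from the patch: `struct cache`, `cache_put`, `cache_get`, and the tiny `BUCKETS` value are all assumptions chosen for the example.

```c
#include <stdlib.h>
#include <string.h>

#define BUCKETS 4 /* deliberately tiny so collisions are easy to provoke */

struct entry {
    char *path;          /* full key, kept so collisions can be told apart */
    int value;
    struct entry *next;  /* chain of entries sharing one bucket */
};

struct cache {
    struct entry *bucket[BUCKETS];
};

static unsigned hash_path(const char *s)
{
    unsigned h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Insert or update; returns 0 on success, -1 on allocation failure. */
int cache_put(struct cache *c, const char *path, int value)
{
    unsigned i = hash_path(path);
    struct entry *e;

    for (e = c->bucket[i]; e; e = e->next) {
        if (strcmp(e->path, path) == 0) {
            e->value = value;
            return 0;
        }
    }
    e = malloc(sizeof(*e));
    if (!e)
        return -1;
    e->path = malloc(strlen(path) + 1);
    if (!e->path) {
        free(e);
        return -1;
    }
    strcpy(e->path, path);
    e->value = value;
    e->next = c->bucket[i];
    c->bucket[i] = e;
    return 0;
}

/* Returns 1 and stores the value if the exact path is cached, else 0. */
int cache_get(const struct cache *c, const char *path, int *value)
{
    const struct entry *e;

    for (e = c->bucket[hash_path(path)]; e; e = e->next) {
        if (strcmp(e->path, path) == 0) {
            *value = e->value;
            return 1;
        }
    }
    return 0;
}
```

Keeping one `struct cache` for files and another for directories means a directory entry can never be mistaken for a file entry, even when both use the same path string. Entries are never freed in this sketch; a real cache would also need a way to clear itself.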

zappepappe commented 7 years ago

Initial testing looks very good. Tried it a few times now and all files are always showing up. Usually some if not all of the rars have been missing on my test mount. Will keep trying some more, but if it's not fixed it's at least a lot better.

hasse69 commented 7 years ago

Going from bad to good news then hopefully ;)

zappepappe commented 7 years ago

It has been working without any problems, even on my real mount. I do not know if it was fixed with this patch or something else, but samba has had troubles detaching from the mount before, forcing me to reload samba in order to unmount. But now it works as expected, after leaving the folder/directory and waiting a bit it unmounts cleanly.

hasse69 commented 7 years ago

Ok, you can easily check if the patch matters by running latest master.

hasse69 commented 7 years ago

I will merge this patch later and close this issue if that is ok with you.

zappepappe commented 7 years ago

Fine by me. Thank you.

hasse69 commented 7 years ago

Note that there is another patch pending that removes the internal support for RAR files containing other RAR files. If this support is needed, stacking should be used instead. Just giving you a heads up...

zappepappe commented 7 years ago

I already use stacking with --flat-only since our earlier issue with scanning/indexing. But thank you for the heads up.

hasse69 commented 7 years ago

Yes, that was my guess too. But the --flat-only option is being removed so after that patch is merged you can no longer use it.