Closed iain-macdonald closed 1 week ago
Inode.Forgotten() to detect when Inos can be reused.
I would have to think this through in depth to double-check, but Forgotten() is intended for enabling cleanup in a scenario when you are sure the inode cannot be revived. If you are (re)using the inodes, then clearly you are reviving the inode. In a reuse scenario, the value of Forgotten()
(or any other call that does no locking) can always be out of date by the time you make a decision based on the value.
If you want to securely reuse inodes, I'd put the eligible inodes in a global queue protected by mutex. You can put them in the queue if they are Forgotten
. For construction, you can take them out of the queue one by one.
Does that make sense?
maybe I misunderstand the bugreport, though. Who or what is issuing the Forgotten
calls? IIRC, the locking should ensure that the inode only becomes visible in the tree atomically with the Forgotten
becoming false.
The code I'm talking about is here and here. It's not reusing inodes, it's reusing the uint64
identifier Attr.Ino (is that okay?). It reuses them only after the previous Inode with that ID has been Forgotten()
, but as I mentioned, there is a race between when NewInode()
is called and when Forgotten()
is called.
"Forgotten() is intended for enabling cleanup in a scenario when you are sure the inode cannot be revived." leads me to believe the current behavior of Forgotten()
is not quite correct, as it can be called immediately after NewInode()
but before being added to the filesystem, so it can return true
, then false
, and then later true
again. Though the code I linked may also be misbehaving in some way.
To be clear, I'm also fine with adding an Initialized()
function to check for the case I describe.
Hey Han-Wen, did you have time to think about this and figure out which proposal I mentioned is more desirable?
FWIW it looks like Forgotten()
is not too widely used among open-source repos hosted in GitHub, if that's any signal about the broader impact of changing its behavior: https://sourcegraph.com/search?q=context:global+.Forgotten%28%29+lang:Go+&patternType=keyword&sm=0
did you have time to think about this and figure out which proposal I mentioned is more desirable?
unfortunately, I haven't. Familial duties have been taking all my time.
Hey there, I created a PR with a workaround for this here: https://github.com/hanwen/go-fuse/pull/520 -- any chance someone could take a look at it this week? Thank you!
sorry, no. personal circumstances.
Hello, wondering if you've had a moment to think about this bug or the proposed workaround in the last couple of months?
Mildly off-topic: when you reuse inode numbers, tools like rsync (with the -H flag) will think the files are hard links. Are you aware of this?
Looks like "cp -a" too.
See https://review.gerrithub.io/c/hanwen/go-fuse/+/1200068 for how you could implement this. Does this work for you? and does a stress test under the race detector come out clean?
This may still be fishy though. I wrote NewInode() assuming that the InodeEmbedder is "fresh": initialization routines do not need synchronization, as concurrent access is only possible after the NewXxxx succeeds: the return value is shared to multiple goroutines. Reading between the lines, you are passing something into NewInode that is already accessible to other routines?
IIUC, the problem is in soci-snapshotter, which you didn't write but are using? Could you convince one of their authors to participate in the discussion here, so I understand better what is going on?
In retrospect, I think adding the Forgotten
method was premature: it's not actually used (ie. tested) within the go-fuse package.
The sourcegraph links you post are mostly from jacobsa's FUSE package which has an example that defines a Forgotten
method.
That is actually discussed in stargz-snapshotter repo.
What we're trying to do is to create a child *fs.Inode
using NewInode()
in the parent's Lookup
method implementation. This Lookup
implementation shares *fs.Inode
returned by NewInode()
to other goroutines (before returning from Lookup
). Other goroutines take the *fs.Inode
and can call Forgotten()
to check if that Inode is forgotten or not. This check can happen even before that Lookup
method returns. So this can hit the condition described in the above:
Forgotten() returns true briefly between Inode construction and Inode initialization
I spent some time thinking about this, but I would rather not change the meaning of Forgotten
.
How about I delay the call to OnAdd
until after addNewChild? Then you can make the new or recycled node visible to the rest of your system in an OnAdd
method.
How about I delay the call to OnAdd until after addNewChild? Then you can make the new or recycled node visible to the rest of your system in an OnAdd method.
Thanks. This sounds good to me.
Can you have a look at https://review.gerrithub.io/c/hanwen/go-fuse/+/1200132 ? Would that work for you?
I am not satisfied with the solution, though. Doing the callback more than once requires you to complicate logic, and overall it doesn't solve the problem that Forgotten
is a hard to reason about.
Maybe I can simply post Inodes onto a channel once I notice that they have become unreachable? That would probably let you write your code more naturally.
Thanks for all of the input, rfjakob, Han-Wen, and Kohei!
Posting forgotten inodes to a channel would probably work for us, but I'll let Kohei confirm.
I also have another possible solution, along with a test that demonstrates the issue in https://github.com/hanwen/go-fuse/pull/520/files. I'm not particularly tied to submitting that, though, I just proposed it as the least invasive solution I could think of.
Can you have a look at https://review.gerrithub.io/c/hanwen/go-fuse/+/1200132 ? Would that work for you?
Thanks. Overall looks good to me. It would also be great if the user can register OnTreeAdd
callback to fs.MemRegularFile
and fs.MemSymlink
as well.
Maybe I can simply post Inodes onto a channel once I notice that they have become unreachable? That would probably let you write your code more naturally.
Yes, this also sounds useful.
I also have another possible solution, along with a test that demonstrates the issue in https://github.com/hanwen/go-fuse/pull/520/files. I'm not particularly tied to submitting that, though, I just proposed it as the least invasive solution I could think of.
I don't understand why that would work; aren't you recycling inodes? Eventually you'll be returning a recycled (Initialized() == true) Inode as a child, wouldn't that have the same problems you sketch earlier? Second, the value Initialized()==false
is useless, as it could be outdated by the time you inspect the boolean.
It would also be great if the user can register OnTreeAdd callback to fs.MemRegularFile and fs.MemSymlink as well.
you can default-embed these to get whatever custom behavior you want.
I also have another possible solution, along with a test that demonstrates the issue in https://github.com/hanwen/go-fuse/pull/520/files. I'm not particularly tied to submitting that, though, I just proposed it as the least invasive solution I could think of.
I don't understand why that would work; aren't you recycling inodes? Eventually you'll be returning a recycled (Initialized() == true) Inode as a child, wouldn't that have the same problems you sketch earlier? Second, the value
Initialized()==false
is useless, as it could be outdated by the time you inspect the boolean.
The code is creating new inode objects with inode numbers that have been "Forgotten": https://github.com/containerd/stargz-snapshotter/blob/2030a405041db038298c3df08bd4ec4d7ee50252/store/fs.go#L215-L220 IIUC the newly created inodes will have Initialized() = false
in that case. And it's fine for our use case if Initialized()
returns a false negative because in that case we won't reuse the inode number.
It would also be great if the user can register OnTreeAdd callback to fs.MemRegularFile and fs.MemSymlink as well.
you can default-embed these to get whatever custom behavior you want.
@ktock - can you look over https://review.gerrithub.io/c/hanwen/go-fuse/+/1201090 and see if that works for you?
@hanwen Thanks, it looks good to me.
I recently discovered a bug in a go-fuse client related to the
Inode.Forgotten()
function. Roughly speaking, the client has logic for creating and reusing Inode Inos, and relies on callingInode.Forgotten()
to detect when Inos can be reused. However, Forgotten() returnstrue
briefly between Inode construction and Inode initialization, so races can occur and lead to Ino reuse. This is a bit awkward to fix in the client, so I thought I'd ask ifInode.Forgotten()
should really returntrue
ifInode.lookupCount == 0
as in that case the Inode has not actually been initialized yet, or alternatively, if anInitialized()
function could be added allowing the caller to check if an Inode has been initialized or not?