hanwen / go-fuse

FUSE bindings for Go
Other
2.03k stars 324 forks source link

Race in Inode.Forgotten() #504

Closed iain-macdonald closed 1 week ago

iain-macdonald commented 6 months ago

I recently discovered a bug in a go-fuse client related to the Inode.Forgotten() function. Roughly speaking, the client has logic for creating and reusing Inode Inos, and relies on calling Inode.Forgotten() to detect when Inos can be reused. However, Forgotten() returns true briefly between Inode construction and Inode initialization, so races can occur and lead to Ino reuse. This is a bit awkward to fix in the client, so I thought I'd ask if Inode.Forgotten() should really return true if Inode.lookupCount == 0 as in that case the Inode has not actually been initialized yet, or alternatively, if an Initialized() function could be added allowing the caller to check if an Inode has been initialized or not?

hanwen commented 6 months ago

Inode.Forgotten() to detect when Inos can be reused.

I would have to think this through in depth to double-check, but Forgotten() is intended for enabling cleanup in a scenario when you are sure the inode cannot be revived. If you are (re)using the inodes, then clearly you are reviving the inode. In a reuse scenario, the value of Forgotten() (or any other call that does no locking) can always be out of date by the time you make a decision based on the value.

If you want to securely reuse inodes, I'd put the eligible inodes in a global queue protected by mutex. You can put them in the queue if they are Forgotten. For construction, you can take them out of the queue one by one.

Does that make sense?

hanwen commented 6 months ago

maybe I misunderstand the bugreport, though. Who or what is issuing the Forgotten calls? IIRC, the locking should ensure that the inode only becomes visible in the tree atomically with the Forgotten becoming false.

iain-macdonald commented 6 months ago

The code I'm talking about is here and here. It's not reusing inodes, it's reusing the uint64 identifier Attr.Ino (is that okay?). It reuses them only after the previous Inode with that ID has been Forgotten(), but as I mentioned, there is a race between when NewInode() is called and when Forgotten() is called.

iain-macdonald commented 6 months ago

"Forgotten() is intended for enabling cleanup in a scenario when you are sure the inode cannot be revived." leads me to believe the current behavior of Forgotten() is not quite correct, as it can be called immediately after NewInode() but before being added to the filesystem, so it can return true, then false, and then later true again. Though the code I linked may also be misbehaving in some way.

To be clear, I'm also fine with adding an Initialized() function to check for the case I describe.

iain-macdonald commented 6 months ago

Hey Han-Wen, did you have time to think about this and figure out which proposal I mentioned is more desirable?

FWIW it looks like Forgotten() is not too widely used among open-source repos hosted in GitHub, if that's any signal about the broader impact of changing its behavior: https://sourcegraph.com/search?q=context:global+.Forgotten%28%29+lang:Go+&patternType=keyword&sm=0

hanwen commented 6 months ago

did you have time to think about this and figure out which proposal I mentioned is more desirable?

unfortunately, I haven't. Familial duties have been taking all my time.

iain-macdonald commented 3 months ago

Hey there, I created a PR with a workaround for this here: https://github.com/hanwen/go-fuse/pull/520 -- any chance someone could take a look at it this week? Thank you!

hanwen commented 3 months ago

sorry, no. personal circumstances.

iain-macdonald commented 1 month ago

Hello, wondering if you've had a moment to think about this bug or the proposed workaround in the last couple of months?

rfjakob commented 1 month ago

Mildly off-topic: when you reuse inode numbers, tools like rsync (with the -H flag) will think the files are hard links. Are you aware of this?

rfjakob commented 1 month ago

Looks like "cp -a" too.

hanwen commented 1 month ago

See https://review.gerrithub.io/c/hanwen/go-fuse/+/1200068 for how you could implement this. Does this work for you? and does a stress test under the race detector come out clean?

This may still be fishy though. I wrote NewInode() assuming that the InodeEmbedder is "fresh": initialization routines do not need synchronization, as concurrent access is only possible after the NewXxxx succeeds: the return value is shared to multiple goroutines. Reading between the lines, you are passing something into NewInode that is already accessible to other routines?

hanwen commented 1 month ago

IIUC, the problem is in soci-snapshotter, which you didn't write but are using? Could you convince one of their authors to participate in the discussion here, so I understand better what is going on?

In retrospect, I think adding the Forgotten method was premature: it's not actually used (ie. tested) within the go-fuse package.

The sourcegraph links you post are mostly from jacobsa's FUSE package which has an example that defines a Forgotten method.

ktock commented 1 month ago

That is actually discussed in stargz-snapshotter repo.

What we're trying to do is to create a child *fs.Inode using NewInode() in the parent's Lookup method implementation. This Lookup implementation shares *fs.Inode returned by NewInode() to other goroutines (before returning from Lookup). Other goroutines take the *fs.Inode and can call Forgotten() to check if that Inode is forgotten or not. This check can happen even before that Lookup method returns. So this can hit the condition described in the above:

Forgotten() returns true briefly between Inode construction and Inode initialization

hanwen commented 1 month ago

I spent some time thinking about this, but I would rather not change the meaning of Forgotten.

How about I delay the call to OnAdd until after addNewChild? Then you can make the new or recycled node visible to the rest of your system in an OnAdd method.

ktock commented 1 month ago

How about I delay the call to OnAdd until after addNewChild? Then you can make the new or recycled node visible to the rest of your system in an OnAdd method.

Thanks. This sounds good to me.

hanwen commented 1 month ago

Can you have a look at https://review.gerrithub.io/c/hanwen/go-fuse/+/1200132 ? Would that work for you?

I am not satisfied with the solution, though. Doing the callback more than once requires you to complicate logic, and overall it doesn't solve the problem that Forgotten is a hard to reason about.

Maybe I can simply post Inodes onto a channel once I notice that they have become unreachable? That would probably let you write your code more naturally.

iain-macdonald commented 1 month ago

Thanks for all of the input, rfjakob, Han-Wen, and Kohei!

Posting forgotten inodes to a channel would probably work for us, but I'll let Kohei confirm.

I also have another possible solution, along with a test that demonstrates the issue in https://github.com/hanwen/go-fuse/pull/520/files. I'm not particularly tied to submitting that, though, I just proposed it as the least invasive solution I could think of.

ktock commented 1 month ago

Can you have a look at https://review.gerrithub.io/c/hanwen/go-fuse/+/1200132 ? Would that work for you?

Thanks. Overall looks good to me. It would also be great if the user can register OnTreeAdd callback to fs.MemRegularFile and fs.MemSymlink as well.

Maybe I can simply post Inodes onto a channel once I notice that they have become unreachable? That would probably let you write your code more naturally.

Yes, this also sounds useful.

hanwen commented 1 month ago

I also have another possible solution, along with a test that demonstrates the issue in https://github.com/hanwen/go-fuse/pull/520/files. I'm not particularly tied to submitting that, though, I just proposed it as the least invasive solution I could think of.

I don't understand why that would work; aren't you recycling inodes? Eventually you'll be returning a recycled (Initialized() == true) Inode as a child, wouldn't that have the same problems you sketch earlier? Second, the value Initialized()==false is useless, as it could be outdated by the time you inspect the boolean.

It would also be great if the user can register OnTreeAdd callback to fs.MemRegularFile and fs.MemSymlink as well.

you can default-embed these to get whatever custom behavior you want.

iain-macdonald commented 3 weeks ago

I also have another possible solution, along with a test that demonstrates the issue in https://github.com/hanwen/go-fuse/pull/520/files. I'm not particularly tied to submitting that, though, I just proposed it as the least invasive solution I could think of.

I don't understand why that would work; aren't you recycling inodes? Eventually you'll be returning a recycled (Initialized() == true) Inode as a child, wouldn't that have the same problems you sketch earlier? Second, the value Initialized()==false is useless, as it could be outdated by the time you inspect the boolean.

The code is creating new inode objects with inode numbers that have been "Forgotten": https://github.com/containerd/stargz-snapshotter/blob/2030a405041db038298c3df08bd4ec4d7ee50252/store/fs.go#L215-L220 IIUC the newly created inodes will have Initialized() = false in that case. And it's fine for our use case if Initialized() returns a false negative because in that case we won't reuse the inode number.

It would also be great if the user can register OnTreeAdd callback to fs.MemRegularFile and fs.MemSymlink as well.

you can default-embed these to get whatever custom behavior you want.

hanwen commented 2 weeks ago

@ktock - can you look over https://review.gerrithub.io/c/hanwen/go-fuse/+/1201090 and see if that works for you?

ktock commented 1 week ago

@hanwen Thanks, it looks good to me.