pwmarcz / fs-demo

Filesystem demo for Graphene

What does server state contain exactly? #3

Open dimakuv opened 3 years ago

dimakuv commented 3 years ago

From the README and diagrams, it's hard to understand what kind of Server state is kept. Also, it's not obvious what the relationship between Dentries and Handles is.

From what I understand:

Also:

Is this understanding correct?

dimakuv commented 3 years ago

Does it also mean that forked children don't need to obtain Dentries from the parent (which currently happens in Graphene)? Because we have a centralized storage of all Dentries in Server.

pwmarcz commented 3 years ago

From the README and diagrams, it's hard to understand what kind of Server state is kept.

My idea was that the server doesn't store the full objects, just locks, along with the information that needs to be synchronized. The diagrams actually show the full information being exchanged: object IDs and (in the case of file handles) the position.

For Dentries especially, no data actually needs to be kept: the assumption is that a dentry can be invalidated, and the client will reload it once it actually needs it. I'm not sure if that's a good idea, but it seems in line with the fact that dentries are only a cache, not the source of truth.
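To make the "dentries are only a cache" idea concrete, here is a minimal client-side sketch. The `DentryCache` class, its method names, and the `reloads` counter are hypothetical and not taken from the fs-demo source; the sketch only assumes that the server can tell a client to invalidate a path, after which the client reloads metadata from the filesystem on the next lookup.

```python
import os


class DentryCache:
    """Hypothetical client-side cache: path -> metadata, reloaded lazily."""

    def __init__(self):
        self._entries = {}
        self.reloads = 0  # counts filesystem lookups, for illustration only

    def invalidate(self, path):
        # Would be called when the server reports that another client
        # changed this path. No data is pushed, the entry is just dropped.
        self._entries.pop(path, None)

    def lookup(self, path):
        if path not in self._entries:
            self.reloads += 1
            try:
                st = os.stat(path)
                self._entries[path] = {"exists": True, "size": st.st_size}
            except FileNotFoundError:
                # Negative dentry: remember that the file does not exist.
                self._entries[path] = {"exists": False, "size": None}
        return self._entries[path]
```

With this shape the server never needs to hold the metadata itself: a stale entry stays stale until `invalidate` is called, and the next `lookup` goes back to the filesystem.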

To go through your post:

Server keeps all known dentries; they are never removed

Server keeps only "sync handles" / locks[1] for known dentries, i.e. information about which clients hold them. They are removed once the last client forgets them or exits. (source code: deleting unused handle)
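As a rough illustration of that lifetime rule, the server-side bookkeeping could look like the sketch below. `SyncServer` and its methods are made-up names for this example, not the actual fs-demo code; the only assumption is that the server tracks which clients hold each object ID and drops the entry when the set becomes empty.

```python
class SyncServer:
    """Hypothetical server state: object ID -> set of clients holding it."""

    def __init__(self):
        self.handles = {}

    def acquire(self, client, object_id):
        # Register the client as a holder, creating the sync handle if needed.
        self.handles.setdefault(object_id, set()).add(client)

    def forget(self, client, object_id):
        holders = self.handles.get(object_id)
        if holders is None:
            return
        holders.discard(client)
        if not holders:
            # Last holder is gone: the sync handle is removed.
            del self.handles[object_id]

    def client_exit(self, client):
        # A client that exits implicitly forgets everything it held.
        for object_id in list(self.handles):
            self.forget(client, object_id)
```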

Each dentry has a canonical path + some file metadata

Yes, but that metadata can be stored on client only, and reloaded from the filesystem if necessary.

Each dentry has a list of associated handles (? I imagine this is needed to propagate things like "file was removed")

Good point... This design doesn't handle unlinking files that are open. I guess a dentry could have a handle_count, and it would be decreased on handle close?
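A minimal sketch of the `handle_count` idea, assuming an in-memory table of dentries. The `Namespace` class, the `unlinked` flag, and the `deleted` list are hypothetical names for illustration. The sketch deliberately does not cover creating a new file under the same path while the old one is still open.

```python
class Dentry:
    def __init__(self, path):
        self.path = path
        self.handle_count = 0   # number of open handles backed by this dentry
        self.unlinked = False   # unlink requested while handles were open


class Namespace:
    """Hypothetical dentry table with deferred deletion."""

    def __init__(self):
        self.dentries = {}
        self.deleted = []  # paths whose data was actually deleted

    def open(self, path):
        dentry = self.dentries.get(path)
        if dentry is None:
            dentry = self.dentries[path] = Dentry(path)
        dentry.handle_count += 1
        return dentry

    def unlink(self, path):
        dentry = self.dentries[path]
        dentry.unlinked = True
        if dentry.handle_count == 0:
            self._delete(dentry)

    def close(self, dentry):
        dentry.handle_count -= 1
        if dentry.unlinked and dentry.handle_count == 0:
            # Last handle closed after an unlink: delete for real now.
            self._delete(dentry)

    def _delete(self, dentry):
        if self.dentries.get(dentry.path) is dentry:
            del self.dentries[dentry.path]
        self.deleted.append(dentry.path)
```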

Server keeps all known handles

Same as above: there's no need to keep/synchronize information about all handles, only locking and the attributes that can be changed by client.

Handles are removed when closed by all clients

Yes. When the last client closes a handle with a given ID, it's automatically removed by the server.

Each handle references a corresponding dentry (which may be in "negative" state if file was removed)
Several clients may use the same handle (depicted on the second diagram)
All their accesses to this handle will be synchronized by the server (including the position pointer)
Several clients may use two handles backed by the same dentry

Yes.
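For the shared-handle case, a small sketch of what "synchronized by the server (including the position pointer)" might mean in practice. `HandleState`, `Server`, and `client_write` are hypothetical names, and a `threading.Lock` stands in for whatever locking protocol the server would really use between processes.

```python
import threading


class HandleState:
    """Hypothetical per-handle server state: a lock and the file position."""

    def __init__(self):
        self.lock = threading.Lock()
        self.pos = 0


class Server:
    def __init__(self):
        self.handles = {}  # handle ID -> HandleState

    def get(self, handle_id):
        return self.handles.setdefault(handle_id, HandleState())


def client_write(server, handle_id, buf, data):
    """Write `data` at the shared position and advance it atomically."""
    state = server.get(handle_id)
    with state.lock:
        pos = state.pos
        buf[pos:pos + len(data)] = data   # buf stands in for the file contents
        state.pos = pos + len(data)
```

Two clients calling `client_write` with the same handle ID append one after another instead of overwriting each other, because both the position read and the update happen under the handle's lock.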

Clients that use handles backed by different dentries are never synchronized by the server (except for corner cases of rename and sendfile and maybe some more).

Yes, except for some initial communication with the server, to establish that they are using a given handle and dentry.

Does it also mean that forked children don't need to obtain Dentries from the parent (which currently happens in Graphene)? Because we have a centralized storage of all Dentries in Server.

Right now, they still need to, because the server actually doesn't store full dentry data. But maybe that's a mistake, and any object (handle/dentry) should actually be fully rebuildable from server data? I thought that's unnecessary complexity, but we already have it in the checkpointing system.

pwmarcz commented 3 years ago

So yeah... it looks like I need to think about how it all fits with fork / checkpointing system.

I would encourage you to take a look at the source (or try running it), but I realize you probably don't have too much time today. In any case, thank you very much for the questions, they're very helpful!

mkow commented 3 years ago

Good point... This design doesn't handle unlinking files that are open. I guess a dentry could have a handle_count, and it would be decreased on handle close?

Hmm, if you don't have inodes, how would you handle the following?

  1. A creates and opens asdf.txt.
  2. A unlinks asdf.txt, but still keeps the FD.
  3. B creates and opens asdf.txt.
  4. A and B should work on different files at this point.

pwmarcz commented 3 years ago

So, first of all: regardless of internal representation, how do we handle this for a file mounted from the host? We cannot delete it immediately.

I know of a solution for a similar problem in FUSE: when deleting a file that is open, rename it to <file>.fuse_hiddenXXXX, and only really delete it when we close all handles to it.
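A rough sketch of that rename-on-unlink approach for host files, using only standard `os` calls. The `HostMount` and `HostFile` classes and the `.fs_hidden` suffix are made up for this example and are not the exact FUSE behavior or naming.

```python
import os
import uuid


class HostFile:
    def __init__(self, path):
        self.path = path        # current location on the host
        self.open_count = 0
        self.hidden = False     # True once unlinked while still open


class HostMount:
    """Hypothetical tracker for open files on a host-backed mount."""

    def __init__(self):
        self.files = {}  # visible path -> HostFile

    def open(self, path):
        f = self.files.get(path)
        if f is None:
            f = self.files[path] = HostFile(path)
        f.open_count += 1
        return f

    def unlink(self, path):
        f = self.files.pop(path, None)
        if f is None or f.open_count == 0:
            os.unlink(path)  # nobody has it open: delete immediately
            return
        # Still open: move it out of the way instead of deleting it.
        hidden = f"{path}.fs_hidden{uuid.uuid4().hex[:8]}"
        os.rename(path, hidden)
        f.path, f.hidden = hidden, True

    def close(self, f):
        f.open_count -= 1
        if f.open_count == 0:
            if f.hidden:
                os.unlink(f.path)  # last handle closed: really delete
            else:
                self.files.pop(f.path, None)
```

After `unlink`, the visible path is free for a new file, while existing handles keep working through the renamed file until the last one is closed.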

In terms of internal representation, I think we can do a similar thing with dentries: for instance, mark the old dentry as "hidden" and superseded by the new one. When opening a new file, you would traverse this link, same as you traverse a symlink.
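One possible reading of the "hidden and superseded" idea, as an in-memory sketch. The `DentryTable` class and the `superseded_by` field are hypothetical names, and the `data` attribute only exists to show that the two files stay separate.

```python
class Dentry:
    def __init__(self, path):
        self.path = path
        self.hidden = False        # unlinked, but possibly still open
        self.superseded_by = None  # newer dentry created under the same path
        self.data = bytearray()    # stand-in for file contents


class DentryTable:
    """Hypothetical table: path -> oldest dentry in a chain for that path."""

    def __init__(self):
        self.by_path = {}

    def resolve(self, path):
        # Follow the chain past hidden dentries, like following a symlink.
        d = self.by_path.get(path)
        while d is not None and d.hidden:
            d = d.superseded_by
        return d  # None means no visible file under this path

    def create(self, path):
        if self.resolve(path) is not None:
            raise FileExistsError(path)
        new = Dentry(path)
        d = self.by_path.get(path)
        if d is None:
            self.by_path[path] = new
        else:
            while d.superseded_by is not None:
                d = d.superseded_by
            d.superseded_by = new  # link the newest hidden dentry to the new one
        return new

    def unlink(self, path):
        d = self.resolve(path)
        if d is None:
            raise FileNotFoundError(path)
        d.hidden = True
```

In the scenario above, A keeps its reference to the old (hidden) dentry, while B's `create` and any later `resolve` of the path land on the new one, so the two work on different files. The sketch never frees hidden dentries; a real version would drop them once their last handle is closed.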

I agree it sounds pretty hacky, and might make a good case for introducing inodes. On the other hand, I'm still not sure if it's justification enough, as it would make the server state more complicated.

mkow commented 3 years ago

I'm fine with this hack, just please handle this scenario correctly :)