Open dimakuv opened 3 years ago
Does it also mean that forked children don't need to obtain Dentries from the parent (which currently happens in Graphene)? Because we have a centralized storage of all Dentries in Server.
From the README and diagrams, it's hard to understand what kind of Server state is kept.
My idea was that the server doesn't store the full objects, just locks, along with the information that needs to be synchronized. The diagrams actually contain full information being exchanged: these are object IDs, and (in case of file handles) the position.
For dentries especially, no data actually needs to be kept: the assumption is that a dentry can be invalidated, and the client will reload it once it actually needs it. I'm not sure if that's a good idea, but it seems in line with the fact that dentries are only a cache, not a source of truth.
To go through your post:
Server keeps all known dentries; they are never removed
Server keeps only "sync handles" / locks[1] for known dentries, i.e. information about which clients hold them. They are removed once the last client forgets them, or exits. (source code: deleting unused handle)
Each dentry has a canonical path + some file metadata
Yes, but that metadata can be stored on client only, and reloaded from the filesystem if necessary.
Each dentry has a list of associated handles (? I imagine this is needed to propagate things like "file was removed")
Good point... This design doesn't handle unlinking files that are open. I guess a dentry could have a handle_count, and it would be decreased on handle close?
Server keeps all known handles
Same as above: there's no need to keep/synchronize information about all handles, only locking and the attributes that can be changed by client.
Handles are removed when closed by all clients
Yes. When the last client closes a handle with a given ID, it's automatically removed by the server.
Each handle references a corresponding dentry (which may be in "negative" state if file was removed)
Several clients may use the same handle (depicted on the second diagram)
All their accesses to this handle will be synchronized by the server (including the position pointer)
Several clients may use two handles backed by the same dentry
Yes.
Clients that use handles backed by different dentries are never synchronized by the server (except for corner cases of rename and sendfile and maybe some more).
Yes, except for some initial communication with a server, to establish that they are using a given handle and dentry.
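To make the "server keeps only sync handles" idea above a bit more concrete, here is a minimal Python sketch. It is not the actual implementation: the names `SyncServer`, `acquire`, `release` and `client_exit` are hypothetical, and the sketch only models which clients hold which handle ID, dropping an entry once the last holder forgets it or exits.

```python
from collections import defaultdict


class SyncServer:
    """Toy model of server state: only who holds which sync handle."""

    def __init__(self):
        # handle ID -> set of client IDs currently holding it
        self._holders = defaultdict(set)

    def acquire(self, handle_id, client_id):
        # Register that this client uses the handle (dentry or file handle).
        self._holders[handle_id].add(client_id)

    def release(self, handle_id, client_id):
        holders = self._holders.get(handle_id)
        if holders is None:
            return
        holders.discard(client_id)
        if not holders:
            # Last client forgot the handle: drop the server-side entry.
            del self._holders[handle_id]

    def client_exit(self, client_id):
        # A client that exits implicitly releases everything it held.
        for handle_id in list(self._holders):
            self.release(handle_id, client_id)

    def known_handles(self):
        return set(self._holders)
```

In this model the server never stores dentry metadata or file contents; anything beyond the holder set (for example a file position) would be an extra per-handle attribute.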
Does it also mean that forked children don't need to obtain Dentries from the parent (which currently happens in Graphene)? Because we have a centralized storage of all Dentries in Server.
Right now, they still need to, because the server actually doesn't store full dentry data. But maybe that's a mistake, and any object (handle/dentry) should actually be fully rebuildable from server data? I thought that's unnecessary complexity, but we already have it in the checkpointing system.
So yeah... it looks like I need to think about how it all fits with fork / checkpointing system.
I would encourage you to take a look at the source (or try running it), but I realize you probably don't have too much time today. In any case, thank you very much for the questions, they're very helpful!
Good point... This design doesn't handle unlinking files that are open. I guess a dentry could have a handle_count, and it would be decreased on handle close?
Hmm, if you don't have inodes, how would you handle the following?
1. A process opens asdf.txt.
2. The process removes asdf.txt, but still keeps the FD.
3. A new asdf.txt is created.
So, first of all: regardless of internal representation, how do we handle this for a file mounted from the host? We cannot delete it immediately.
I know of a solution for a similar problem in FUSE: when deleting a file that is open, rename it to <file>.fuse_hiddenXXXX, and only really delete it when we close all handles to it.
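A rough Python sketch of that rename-then-delete idea, under stated assumptions: `DeferredUnlink`, `OpenFile` and the `.hiddenNNNN` suffix are made up for illustration and are not FUSE's or Graphene's actual code. An open file is renamed on unlink and only removed from disk when its last handle is closed.

```python
import os


class OpenFile:
    def __init__(self, path):
        self.path = path        # current on-disk name; changes on unlink
        self.handle_count = 0
        self.unlinked = False


class DeferredUnlink:
    def __init__(self):
        self._files = {}        # visible path -> OpenFile
        self._seq = 0

    def open(self, path):
        f = self._files.setdefault(path, OpenFile(path))
        f.handle_count += 1
        return f

    def unlink(self, path):
        f = self._files.pop(path, None)
        if f is None:
            os.unlink(path)     # nobody has it open: delete right away
            return
        self._seq += 1
        hidden = f"{path}.hidden{self._seq:04d}"   # hypothetical naming scheme
        os.rename(path, hidden)  # keep the data reachable for open handles
        f.path = hidden
        f.unlinked = True

    def close(self, f):
        f.handle_count -= 1
        if f.handle_count == 0:
            if f.unlinked:
                os.unlink(f.path)            # last handle gone: really delete
            else:
                self._files.pop(f.path, None)
```

After `unlink`, the original name is free again, so a new file can be created under it while the old handle keeps pointing at the hidden copy.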
In terms of internal representation, I think we can do a similar thing with dentries: for instance, mark the old dentry as "hidden" and superseded by the new one. When opening a new file, you would traverse this link, same as you traverse a symlink.
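As a sketch of what that could look like (purely illustrative; `Dentry`, `superseded_by`, `unlink_and_recreate` and `resolve` are hypothetical names, not existing Graphene structures), the old dentry keeps its handle count and points at its replacement, and lookups follow the link:

```python
class Dentry:
    def __init__(self, path):
        self.path = path
        self.handle_count = 0       # open handles still backed by this dentry
        self.hidden = False
        self.superseded_by = None   # newer dentry for the same path, if any


def unlink_and_recreate(old):
    """Hide `old` and return the fresh dentry that takes over its path."""
    new = Dentry(old.path)
    old.hidden = True
    old.superseded_by = new
    return new


def resolve(dentry):
    """Follow superseded_by links (like a symlink) to the live dentry."""
    while dentry.hidden and dentry.superseded_by is not None:
        dentry = dentry.superseded_by
    return dentry
```

Existing handles would keep using the hidden dentry directly, while a new open of the same path goes through `resolve` and ends up at the newest one.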
I agree it sounds pretty hacky, and might make a good case for introducing inodes. On the other hand, I'm still not sure if it's justification enough, as it would make the server state more complicated.
I'm fine with this hack, just please handle this scenario correctly :)
From the README and diagrams, it's hard to understand what kind of Server state is kept. Also, it's not obvious what the relationship between Dentries and Handles is.
From what I understand:
Server keeps all known dentries; they are never removed
Server keeps all known handles
Also:
Clients that use handles backed by different dentries are never synchronized by the server (except for corner cases of rename and sendfile and maybe some more).
Is this understanding correct?