hanwen opened 1 year ago
Hey! Even though this sounds like a good idea at face value, there are some big differences between FUSE and NFSv4 that are hard to bridge. Most importantly, the lifecycle of nodes is completely different. So I get where the desire comes from, but it may not be realistic to address that without making major API changes. This is also why I don't really believe a project like https://www.fuse-t.org/ can actually work, as it tries to stick to libfuse's API.
With FUSE, the kernel and the FUSE server are effectively in sync on which nodes are in play. Whenever the kernel releases the state associated with an inode, it calls FORGET, allowing the FUSE server to do the same thing. With NFSv4 there is no such thing. If an NFSv4 client calls LOOKUP, it receives a node ID that must remain usable for the entire lifetime of that object (i.e., until the last unlink() has occurred and the file is no longer open anywhere). There is no FORGET call that the client issues to let the server know it no longer cares about the object. So a naive FUSE<->NFSv4 bridge would most likely end up leaking nodes a lot.
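To make the lifecycle mismatch concrete, here is a minimal Go sketch of the per-node state a naive bridge would have to keep (the names are purely illustrative, not go-fuse or Buildbarn APIs):

```go
// Hypothetical sketch (nodeRegistry, Lookup, Forget are illustrative names,
// not go-fuse or Buildbarn APIs). Both protocols make the server pin
// per-object state on LOOKUP; only FUSE ever tells it when to drop that state.
package naivebridge

import "sync"

type nodeRegistry struct {
	mu    sync.Mutex
	next  uint64
	nodes map[uint64]string // node ID -> whatever state we keep per object
}

func newNodeRegistry() *nodeRegistry {
	return &nodeRegistry{nodes: map[uint64]string{}}
}

// Lookup backs both FUSE LOOKUP and NFSv4 LOOKUP: it pins state for the
// resolved object and hands out an identifier the client will use later.
func (r *nodeRegistry) Lookup(path string) uint64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.next++
	r.nodes[r.next] = path
	return r.next
}

// Forget is only ever driven by the FUSE kernel module. NFSv4 has no
// equivalent request, so in a naive bridge the entries created in Lookup
// accumulate for as long as the mount lives.
func (r *nodeRegistry) Forget(id uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.nodes, id)
}
```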
It's worth mentioning that the above only applies to persistent handles. NFSv4 also has the concept of volatile handles, where the server can return a specific error to request that the client retry its LOOKUPs using a cached copy of the pathname. That would counteract the leak. Unfortunately, this mechanism is not as robust as what FUSE does when the file system is mutated in parallel. Furthermore, the macOS NFSv4 client doesn't support volatile handles.
Inside Buildbarn I had to solve this by adding a HandleAllocator abstraction that implementations of file system objects have to use to convey the lifetime of objects to the NFSv4 server. NFSHandleAllocator essentially acts as a registry of every live object in the file system. The equivalent FUSEHandleAllocator is simpler, in that it needs no such registry.
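As a rough illustration of the shape of that abstraction (a simplified sketch of the idea only, not Buildbarn's actual HandleAllocator interface), the NFSv4 flavour has to keep a registry of live objects and relies on each object to signal when its life ends:

```go
// Simplified sketch, not Buildbarn's actual HandleAllocator interface:
// instead of the server guessing when state may be dropped, each file
// system object explicitly releases its allocation once it is gone for
// good (last unlink + last close), removing it from the NFSv4 registry.
package handlesketch

import "sync"

// Node stands in for a file or directory implementation in the tree.
type Node interface{}

type nfsHandleAllocator struct {
	mu   sync.Mutex
	next uint64
	live map[uint64]Node
}

func newNFSHandleAllocator() *nfsHandleAllocator {
	return &nfsHandleAllocator{live: map[uint64]Node{}}
}

// Allocation is handed back to the object; Release is how the object
// conveys "I no longer exist" to the server.
type Allocation struct {
	Handle  uint64
	Release func()
}

func (a *nfsHandleAllocator) New(n Node) Allocation {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.next++
	id := a.next
	a.live[id] = n
	return Allocation{
		Handle: id,
		Release: func() {
			a.mu.Lock()
			defer a.mu.Unlock()
			delete(a.live, id)
		},
	}
}

// Resolve is what the NFSv4 server uses when a client presents a
// filehandle it obtained at any earlier point in time.
func (a *nfsHandleAllocator) Resolve(handle uint64) (Node, bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	n, ok := a.live[handle]
	return n, ok
}
```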
Useful pieces of code/docs within the Buildbarn tree:
Hope that helps!
Thanks, very insightful. You say:
So a naive FUSE<->NFSv4 bridge would most likely end up leaking nodes a lot.
Don't the semantics of NFS mean that an NFS server has to keep metadata for every inode ever accessed in memory? I can see that it would use a lot of memory, but it would not necessarily leak nodes, right?
That’s a good question! If you were to create an NFS server that exposes the contents of a local file system using regular POSIX APIs, then yes. You would either leak memory or not support persistent handles.
Most operating systems solve this by providing special APIs for translating paths to file handles and vice versa. For example, FreeBSD has getfh() and fhopen() system calls for that purpose:
https://man.freebsd.org/cgi/man.cgi?query=getfh&sektion=2&n=1
Those system calls basically allow you to access the files in a file system as if it were a flat namespace.
Linux has open_by_handle_at(2), but it requires root permissions.
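For reference, here is a minimal sketch of that path<->handle round trip from Go, using the existing wrappers in golang.org/x/sys/unix (the paths are placeholders; OpenByHandleAt needs CAP_DAC_READ_SEARCH, i.e. effectively root):

```go
// Minimal sketch of the Linux path<->handle round trip via the wrappers in
// golang.org/x/sys/unix. Paths are just examples; OpenByHandleAt requires
// CAP_DAC_READ_SEARCH, which in practice means running as root.
package main

import (
	"fmt"
	"log"

	"golang.org/x/sys/unix"
)

func main() {
	// Translate a path into a stable, kernel-issued file handle.
	handle, mountID, err := unix.NameToHandleAt(unix.AT_FDCWD, "/etc/hostname", 0)
	if err != nil {
		log.Fatalf("NameToHandleAt: %v", err)
	}
	fmt.Printf("got a %d-byte handle on mount %d\n", handle.Size(), mountID)

	// Any open descriptor on the same mount works as the "mount fd".
	mountFD, err := unix.Open("/etc", unix.O_RDONLY|unix.O_DIRECTORY, 0)
	if err != nil {
		log.Fatalf("open mount fd: %v", err)
	}
	defer unix.Close(mountFD)

	// Later (possibly much later), reopen the file purely from the handle,
	// without the server having to remember the path itself.
	fd, err := unix.OpenByHandleAt(mountFD, handle, unix.O_RDONLY)
	if err != nil {
		log.Fatalf("OpenByHandleAt: %v", err)
	}
	defer unix.Close(fd)
}
```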
It seems the userspace NFS server nfs-ganesha used to have a handle database but switched to open_by_handle_at at some point: https://lists.nfs-ganesha.org/archives/list/support@lists.nfs-ganesha.org/message/DYLG45ZFNJVFTTSUNZIIDE2B5OL4NVXR/
With the demise of OSXFUSE, serving NFS is the most practical direction for supporting FUSE on OSX. Doing so also reduces platform divergence, as we can test this without requiring Apple hardware.
From a brief look, it seems that implementing a bridge for fuse.RawFileSystem would involve a lot of back-and-forth (de)serialization, so serving directly from the fs package is probably preferable. Maybe look at buildbarn, which seems to have an NFS server?
@EdSchouten - any suggestions? Are there specific pieces of buildbarn that I should be looking at?