openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Fuse Implementation #8

Open behlendorf opened 14 years ago

behlendorf commented 14 years ago

There is currently a popular FUSE-based implementation of ZFS for Linux, originally started by Ricardo Correia. I would very much like to explore the idea of working with the current ZFS on FUSE/Linux guys to get the same sort of thing working with this code base. After talking with Ricardo, I know he likes the idea.

http://zfs-fuse.net/

behlendorf commented 13 years ago

With all the ZPL code modified to be more Linux friendly, it should be fairly straightforward to enable building of zfs_vnops.c, zfs_dir.c, and zfs_znode.c in user space. Then equivalents of the zpl_* files, called fuse_*, can be implemented pretty easily. There are three reasons I can think of offhand to support a FUSE implementation.

FransUrbo commented 10 years ago

@behlendorf This issue is ancient. Is it still a wish to integrate with the FUSE implementation (which has been lagging and is no longer under development, as far as I know)?

behlendorf commented 10 years ago

It is still desirable to be able to optionally build the code for FUSE instead of as a kernel module. How exactly we could go about doing that is still up for debate but it would be useful.

ryao commented 10 years ago

One question that needs to be addressed before someone implements an optional FUSE build is how the commands would communicate with it. I recall that ZFS-FUSE had the commands communicate with the daemon via userspace IPC mechanisms. If we implement FUSE support, I think we should use CUSE to allow the FUSE driver to provide the /dev/zfs character device. There is some discussion of techniques here:

http://bryanpendleton.blogspot.com/2011/02/fuse-cuse-and-uio.html

Doing things in this way would mean that the kernel driver and FUSE driver would be mutually exclusive. That is a design decision that simplifies things and unless someone thinks otherwise, is one that I think we should make.

ilovezfs commented 10 years ago

Hmm, that does sound like quite a limitation that would punish non-FUSE ZFS users. Anyone with a ZFS root file system would need a separate box or a virtual machine to interact with the FUSE version, so non-admin users on such systems would have no access to ZFS at all, neither native nor FUSE, even though that access is precisely what FUSE is supposed to enable.

DeHackEd commented 10 years ago

I believe the current ZFS-FUSE used a named socket, thus /dev/zfs would be a socket rather than a character device.

This had some interesting properties, such as using SCM_RIGHTS to send a file descriptor to the FUSE process. Send/recv worked like this, eliminating a copy pass for all data to reach the zfs CLI tool's file descriptors: the FUSE process sent or received the snapshots directly, and the CLI just waited for an event indicating it was finished.

Using CUSE does seem more elegant, but I think the socket method still has advantages.
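
To illustrate the descriptor-passing technique described above, here is a minimal sketch in Python (3.9 or later, for `socket.send_fds` and `socket.recv_fds`). It is not the ZFS-FUSE code: the function names and the `demo` scenario are made up for illustration, and a socket pair plus a pipe stand in for the CLI/daemon connection and the send stream's destination.

```python
import os
import socket


def send_fd(sock, fd):
    """Pass one open file descriptor over a Unix-domain socket (SCM_RIGHTS)."""
    # At least one byte of ordinary data must accompany the ancillary message.
    socket.send_fds(sock, [b"F"], [fd])


def recv_fd(sock):
    """Receive one descriptor; the kernel installs a new fd in this process."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]


def demo():
    """Hypothetical flow: the 'CLI' hands its output fd to the 'daemon'."""
    cli_sock, daemon_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()  # stands in for the CLI's stdout
    try:
        send_fd(cli_sock, write_fd)
        daemon_fd = recv_fd(daemon_sock)
        try:
            # The daemon writes the stream directly; the CLI never copies it.
            os.write(daemon_fd, b"snapshot-bytes")
        finally:
            os.close(daemon_fd)
        return os.read(read_fd, 64)
    finally:
        os.close(read_fd)
        os.close(write_fd)
        cli_sock.close()
        daemon_sock.close()
```

In a real tool the two sockets would live in different processes, and the CLI would then block until the daemon reports completion.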

ilovezfs commented 10 years ago

@ryao If it were to use CUSE, is there any reason that the character device couldn't just be named something like /dev/zfs2 or /dev/zfsfuse?

ryao commented 10 years ago

@ilovezfs Naming the character device /dev/zfs-fuse would work. We would need to implement a way to have the tools use that instead of /dev/zfs, but it is definitely doable. Maybe an environment variable could be used. Another way would be to implement a trick where the tools will switch to /dev/zfs-fuse if they are invoked with special names (e.g. zfs-fuse, zpool-fuse).

behlendorf commented 10 years ago

> Naming the character device /dev/zfs-fuse would work. We would need to implement a way to have the tools use that instead of /dev/zfs, but it is definitely doable. Maybe an environment variable could be used. Another way would be to implement a trick where the tools will switch to /dev/zfs-fuse if they are invoked with special names (e.g. zfs-fuse, zpool-fuse).

I like this suggestion quite a bit. It would cleanly allow us to concurrently support a kmod and a FUSE ZFS implementation with the same tool chain. That solves a lot of annoying issues. I'd suggest we add an environment variable to optionally set the /dev/zfs device name. Then we could either install zfs-fuse and zpool-fuse wrappers which set the environment variable, or just create symlinks and update the utilities to detect the alternate binary name and set the device name appropriately.
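
A rough sketch of how the two selection mechanisms discussed here could combine is shown below. The environment variable name `ZFS_DEVICE`, the function name, and the rule that the variable takes precedence over the binary name are all assumptions made for illustration; nothing in this thread fixes them.

```python
import os

DEFAULT_DEVICE = "/dev/zfs"
FUSE_DEVICE = "/dev/zfs-fuse"
ENV_VAR = "ZFS_DEVICE"  # hypothetical variable name, not an existing setting


def select_device(argv0, environ):
    """Pick the control device for a tool invoked as argv0."""
    # Assumed precedence: an explicit environment override wins.
    override = environ.get(ENV_VAR)
    if override:
        return override
    # Otherwise detect the alternate binary name (e.g. a zfs-fuse symlink).
    if os.path.basename(argv0).endswith("-fuse"):
        return FUSE_DEVICE
    return DEFAULT_DEVICE
```

With this shape, a `zpool-fuse` symlink and a wrapper script that exports the variable both end up on the same code path, so the utilities need only one lookup at startup.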

maci0 commented 9 years ago

Would you not have to re-implement lots of SPL stuff to work in userspace?

DeHackEd commented 9 years ago

It's already implemented in userspace. zdb and ztest need it.

Giving this project a try is on my bucket list. There are a lot of questions that come up from the previous ZFS-FUSE implementation: the .zfs directory may not be workable if snapdir=hidden, and zvols couldn't be presented as actual block devices. Hacks would be required, but something workable is possible.

maci0 commented 9 years ago

You could jump through some hoops and integrate losetup with some hidden image file on a somewhat hidden zfs dataset. I have no idea if it's possible to abstract zvol creation that much. IIRC zvols were completely re-implemented for ZoL, so there might actually already be some abstraction in place. I think effort should go toward making a common zvol API which has to be provided by any potential implementation (FUSE, ZoL, BSD, OSX ...).

DeHackEd commented 9 years ago

I was thinking I could mount a FUSE filesystem on /dev/zvol and provide the volumes but as regular files instead of volumes. It would suffice for most needs. The loopback idea is an interesting one but limited in how many disks are supported so it may be best to leave that to the user.

ryao commented 9 years ago

I gave this some thought because it would enable testing ioctls with the driver in userspace. Here are my thoughts:

The initial implementation should not bother with any of the following bits:

libzpool will need modifications to include more files:

New code would need to be written to replace the module/zfs/zpl_* files by wrapping the Illumos functions.

A daemon would be written (e.g. fuse-zfsd) that links to libzpool and libfuse. It would be a singleton CUSE daemon that creates /dev/zfs-fuse by default, but would support making /dev/zfs. It would only support open() and ioctl(). It would need to use unrestricted ioctls like this:

http://fuse.sourceforge.net/doxygen/cusexmp_8c.html

Code that uses ddi_copyin would be ifdefed to use fuse_reply_ioctl_retry to read data from userspace (via the cuse kernel module) while similarly, code that uses ddi_copyout would be ifdefed to use fuse_reply_ioctl_retry to write the ioctl responses. These use iovecs.

Then the mount protocol would be modified to attempt using an ioctl to mount and fall back to the regular mount command if ENOTSUP is returned. The ioctl would be implemented in the fuse-zfsd daemon to perform a mount through the FUSE API. mount.zfs would be modified to use /dev/zfs-fuse when execve'd with the name mount.zfs-fuse.
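
The fallback logic in the paragraph above could look roughly like this. It is only a sketch of the control flow: `ioctl_mount` and `regular_mount` are hypothetical callables standing in for the proposed mount ioctl and for the existing mount path, and neither corresponds to an existing API.

```python
import errno


def mount_dataset(dataset, mountpoint, ioctl_mount, regular_mount):
    """Try the ioctl-based mount first; fall back only on ENOTSUP."""
    try:
        ioctl_mount(dataset, mountpoint)
        return "ioctl"
    except OSError as exc:
        # Any other error is a real failure and must not be masked.
        if exc.errno != errno.ENOTSUP:
            raise
    # The driver does not implement the mount ioctl (e.g. the kernel module).
    regular_mount(dataset, mountpoint)
    return "mount"
```

Limiting the fallback to ENOTSUP keeps genuine failures, such as a permission error from the daemon, from silently turning into a second mount attempt.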

Killing the CUSE daemon appears to get rid of the character device (from testing the example), but it would be advisable to try umounts on death (should be lazy on Linux) unless we get SIGKILL where we won't be able to respond.
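
As a sketch of the unmount-on-death idea, the handler below walks the daemon's mountpoints when a catchable signal arrives. The helper names, the `lazy_umount` callable (which on Linux might wrap `umount -l`), and the choice to unmount in reverse mount order are assumptions for illustration. SIGKILL cannot be caught, so no handler can cover that case.

```python
import signal


def make_cleanup_handler(mountpoints, lazy_umount):
    """Build a signal handler that lazily unmounts everything we mounted."""
    def handler(signum, frame):
        # Reverse order so nested mounts go before their parents (assumed).
        for mountpoint in reversed(mountpoints):
            try:
                lazy_umount(mountpoint)
            except OSError:
                # Keep going: one failed unmount should not strand the rest.
                pass
    return handler


def install_cleanup(mountpoints, lazy_umount,
                    signals=(signal.SIGTERM, signal.SIGINT)):
    """Register the handler for catchable termination signals."""
    handler = make_cleanup_handler(mountpoints, lazy_umount)
    for sig in signals:
        signal.signal(sig, handler)
    return handler
```

A real daemon would also exit after the handler runs; that step is left out here so the sketch stays side-effect free.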

Lastly, it might be advantageous to refactor our ZPL so that we can recycle the ZFS-FUSE project's FUSE VFS hooks. The ZFS-FUSE developers did a fantastic job of wrapping the Solaris VFS interface such that they can use it seemingly unmodified:

https://github.com/gordan-bobic/zfs-fuse/blob/master/src/zfs-fuse/zfs_operations.c
https://github.com/gordan-bobic/zfs-fuse/blob/master/src/lib/libsolkerncompat/vnode.c
https://github.com/gordan-bobic/zfs-fuse/blob/master/src/lib/libsolkerncompat/vfs.c

If we refactor our ZPL to do the same (i.e. initialize an emulated Solaris VFS, implement VOP_* macros, call them from zpl_*.c, make ITOZ into VTOZ again, et cetera), we make importing code from Illumos easier.

Conan-Kudo commented 7 years ago

A few months back, @ryao, @behlendorf, and I discussed the possibility of having this so that an up-to-date, fully userspace implementation of ZFS would be available. @ryao mentioned he'd take a crack at this.

@ryao, have you managed to make any progress on it?

Conan-Kudo commented 5 years ago

As another point of interest for having ZFS work as a FUSE filesystem: on Linux with kernel 4.18 and higher, you can use FUSE completely unprivileged. This facility was added to support running containers fully unprivileged, as there's now an implementation of OverlayFS with FUSE.

This makes it much easier to use ZFS "everywhere" if people want to, including for container engines and inside containers.

RHEL 8 will support this facility, and I expect SLE 15 will in a future service pack.

mskarbek commented 5 years ago

> As another point of interest for having ZFS work as a FUSE filesystem: on Linux with kernel 4.18 and higher, you can use FUSE completely unprivileged.

What would be really interesting is an ability to fuse-mount certain datasets on a pool otherwise imported and mounted by kernel module - I would like to have root-on-zfs system with datasets under /home/<user>/.local/share/containers/storage and podman able to create containers using the fuse in an unprivileged mode just like I am able to do right now with overlayfs-fuse.

devZer0 commented 4 years ago

> As another point of interest for having ZFS work as a FUSE filesystem: on Linux with kernel 4.18 and higher, you can use FUSE completely unprivileged.
>
> What would be really interesting is an ability to fuse-mount certain datasets on a pool otherwise imported and mounted by kernel module - I would like to have root-on-zfs system with datasets under /home/<user>/.local/share/containers/storage and podman able to create containers using the fuse in an unprivileged mode just like I am able to do right now with overlayfs-fuse.

I think that's not possible, as ZFS does all the disk/block device management. You cannot stack two instances of ZFS on top of each other, i.e. make some datasets available to one instance and some to the other...

grahamperrin commented 1 year ago

https://github.com/openzfs/zfs/issues/8#issuecomment-129561544 (2015):

> … A daemon would be written (e.g. fuse-zfsd) that links to libzpool and libfuse. It would be a singleton CUSE daemon that creates /dev/zfs-fuse by default, but would support making /dev/zfs. It would only support open() and ioctl(). …

Drive-by comment: maybe a different name for the daemon?

To avoid any possible (FreeBSD) end user confusion with zfsd.

digitalsignalperson commented 1 month ago

With FUSE passthrough mode in Linux 6.9, a ZFS FUSE implementation might be more performant than it could have been in the past.

[Image from https://source.android.com/docs/core/storage/fuse-passthrough]