Closed rokroskar closed 2 years ago
@rokroskar for testing this out on mac/linux/windows, is it enough to spin up VMs with each OS and see if I can mount S3, SFTP and NFS drives on each using the appropriate FUSE-based utility? Or are you expecting that we should test how this would potentially work from within the Renku CLI in more detail?
As for the HPC cluster, I guess we would do the same. Spinning up an HPC cluster is a lot more complicated, though, so it would be nice to just get access to an existing one.
I think VMs are fine. Just keep track of any potentially invasive user-access issues that need to be resolved to make FUSE work.
We do have access to HPC clusters at EPFL and ETH. For ETH, anyone with a NETHZ account can log in to euler.ethz.ch (need to be on the VPN).
Ok cool. Here is a list of publicly available buckets, NFS drives and SFTP servers that I will use to test:

- S3 bucket:
- NFS drive:
- FTP: `sftp demo@test.rebex.net` (user `demo`, password `password`)
**NFS server setup**

Dockerfile (please note that you should create a text file called `test-nfs-share-file.txt` that will be shared):

```dockerfile
FROM erichough/nfs-server
COPY test-nfs-share-file.txt /nfs_exports/
VOLUME /nfs_exports
ENV NFS_EXPORT_0='/nfs_exports *(rw,no_subtree_check,anonuid=1001,anongid=1001)'
```
**Build and run commands**

```shell
docker build -t test-nfs-server:0.0.1 .
docker run -ti --rm --privileged --name nfs_server test-nfs-server:0.0.1
```
**Mounting**

Get the container's IP by running `docker inspect <container_id>` and looking under `Networks.IPAddress`.
**Linux observations**

NFS: requires the `nfs-common` library to be installed, and even then running the mount command does not work without root (i.e. `sudo mount ...` is needed). The only way around this is to edit the `/etc/fstab` file on the NFS client as root with an entry like `<container_ip>:/nfs_exports /client/mount/location nfs rw,relatime,user,noauto 0 0`. After this, only the specified share from the specified server can be mounted without root privileges. I do not think this is easy to do on a random user machine from within the Renku CLI.
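To make the `/etc/fstab` mechanics concrete, here is a minimal sketch. The IP `172.17.0.2` and the paths are hypothetical; the point is that the `user` option in the options field is what lets a non-root user run `mount` for that one entry:

```shell
# Hypothetical fstab entry for the test NFS container (IP and paths are made up):
entry='172.17.0.2:/nfs_exports /mnt/nfs_test nfs rw,relatime,user,noauto 0 0'
printf '%s\n' "$entry" > /tmp/fstab-example

# The mount options are the 4th whitespace-separated field; the 'user'
# option is what allows an unprivileged user to mount this entry.
opts=$(awk '{print $4}' /tmp/fstab-example)
case ",$opts," in
  *,user,*) echo "non-root mount allowed" ;;   # → non-root mount allowed
  *)        echo "root required" ;;
esac
```

With such an entry in place, `mount /mnt/nfs_test` works without `sudo`, but only for that one pre-declared share, which is exactly why this does not generalize to arbitrary user machines.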
FTP: use `curlftpfs`, which uses FUSE under the hood. Run `sudo apt install -y curlftpfs`, then:

```shell
curlftpfs test.rebex.net mount_location -o user=demo:password
```
S3: `s3fs` can be installed with `sudo apt install s3fs`; `goofys` can be downloaded as an executable. Mounting a public bucket:

```shell
s3fs -f giab test_mount -o url=http://s3.amazonaws.com -o public_bucket=1
```
**Windows**

`rclone` is the only option, and it additionally requires https://winfsp.dev/rel/ to be installed. Since we do not support "regular" Windows I have decided not to pursue this further.

One slight problem with `rclone` is that it requires a configuration file to be written before it can be used. But if we decide to use `rclone`, I do not think this is a show-stopper but rather an annoyance.
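For reference, the configuration rclone needs is just an INI-style file. A minimal sketch for an anonymous S3 remote follows; the remote name `public-s3` is illustrative, and rclone would normally generate this interactively via `rclone config`:

```shell
# Write a minimal, hypothetical rclone config to a throwaway location:
cat > /tmp/rclone-example.conf <<'EOF'
[public-s3]
type = s3
provider = AWS
EOF

# rclone can then be pointed at it explicitly via its --config flag, e.g.:
#   rclone --config /tmp/rclone-example.conf mount public-s3:some-bucket /mnt/point
grep '^\[public-s3\]' /tmp/rclone-example.conf   # → [public-s3]
```

If we went this route, the CLI would have to generate and manage such a file on the user's behalf, which is the annoyance mentioned above.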
WSL2
WSL1
**Mac**

NFS: according to the `mount` CLI command docs this should work exactly the same as on Linux, which means it also brings the same limitations as Linux.

FTP: `curlftpfs` can be installed through MacPorts.

S3: `goofys` or `s3fs` can be used to mount an S3 bucket without root privileges.

The biggest problem with Mac is that macFUSE is not open source and brew has stopped maintaining all formulae that require FUSE. That means that installing tools to mount things through FUSE on Mac is not that simple, and this will probably not improve in the near future.

In addition, installing macFUSE can be a pain. I currently have it in some weird broken state that I cannot fix, no matter how many times I delete and reinstall it. I am not sure if this is simply me being extremely unlucky or whether other people have similar experiences. The weirdest part is that after you install FUSE I think you need to restart your computer, but right after it comes back up you have to go into the security settings and approve its use. If you miss this step then FUSE will not work.
In summary:

After doing this exercise I am still very worried that adding this feature would require us to troubleshoot rclone and FUSE installations for different users across many different OSs.
@rokroskar let me know what you think. The TLDR is right above this comment ☝
Thanks for this summary @olevski ... it definitely looks like relying on FUSE in user installations is going to be a huge liability. Another thing that occurred to me is that most users probably already have their networked drive mounted by some other means; e.g. labs would probably have something like a NAS available via an SMB share. In those cases we don't need to handle the mounting itself, but only keep track of where the data is mounted so we can map it to what is being used in the given project. In hosted sessions we can probably take care of the mounting and the book-keeping automatically.
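A sketch of that "book-keeping only" approach on Linux: instead of mounting anything ourselves, parse a `/proc/mounts`-style listing to discover where a share is already mounted. The sample data below is made up; on a real machine you would read `/proc/mounts` itself:

```shell
# Fake /proc/mounts content for illustration (the SMB share path is hypothetical):
cat > /tmp/mounts-sample <<'EOF'
//nas.example.org/lab /home/user/lab-data cifs rw,relatime 0 0
/dev/sda1 / ext4 rw 0 0
EOF

# Field 3 is the filesystem type; print the mount point of every CIFS/SMB share:
awk '$3 == "cifs" {print $2}' /tmp/mounts-sample   # → /home/user/lab-data
```

The CLI would then only need to record that mount point and associate it with the project's dataset, with no FUSE installation involved at all.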
So for working locally it should be
To better understand our options when it comes to enabling access to remote storage, we should understand the limitations of using FUSE in various scenarios:
Specifically we need to understand whether there are circumstances under which FUSE may simply not be an option for our users, e.g. under: