iterative / dvc-ssh

SSH/SFTP plugin for dvc
Apache License 2.0

remote: ssh: add support for `shared = group` #15

Open gwerbin opened 4 years ago

gwerbin commented 4 years ago

Filesystem permissions can get messy when multiple users are accessing the same DVC remote over SSH and/or the local filesystem.

It would save some headaches to be able to run something like dvc remote add --shared group and get behavior analogous to dvc config cache.shared group. Files in the remote would keep the same 444 permissions as always, but directories would get 775 instead of the default 755.

As suggested on Discord, the resulting config file would look like this:

[remote myssh]
url = ssh://example.com/path
shared = group
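Until such an option exists, the permission layout described above (directories 775, files 444) can be approximated by hand on the server. This is a minimal sketch, assuming a POSIX shell with find and chmod. The function name apply_group_shared_modes is made up for illustration and is not part of DVC. It only fixes up what already exists, so anything pushed later still gets whatever the pushing user's umask allows.

```shell
#!/bin/sh
# Hypothetical helper (not part of DVC): apply the modes that the proposed
# `shared = group` option is described as producing in this issue.
apply_group_shared_modes() {
    remote_dir=$1
    # Directories: rwxrwxr-x so group members can add and remove entries.
    find "$remote_dir" -type d -exec chmod 775 {} +
    # Files: read-only for everyone (444), as described in the issue.
    find "$remote_dir" -type f -exec chmod 444 {} +
}

# Demo on a throwaway tree rather than a real remote.
demo=$(mktemp -d)
mkdir -p "$demo/ab"
echo data > "$demo/ab/cdef"
chmod 755 "$demo/ab"

apply_group_shared_modes "$demo"

# Keep only the 10-character mode string from ls.
dir_mode=$(ls -ld "$demo/ab" | cut -c1-10)
file_mode=$(ls -l "$demo/ab/cdef" | cut -c1-10)
echo "$dir_mode $file_mode"
rm -rf "$demo"
```

On a real remote you would call apply_group_shared_modes /path/to/remote/cache instead of running the demo part.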

Discord context: https://discordapp.com/channels/485586884165107732/563406153334128681/712724033086685232

marius311 commented 3 years ago

Just figured I'd point out, in case anyone runs into this, that a current workaround is to use Linux ACLs (where supported) to force folders in the remote cache to be created with group read/write permissions:

setfacl -R -d -m g:groupname:rwX /path/to/remote/cache

where groupname is replaced with your actual group's name. See e.g. here for more info.

gwerbin commented 3 years ago

@marius311 +1, this is exactly what my org ended up doing. I think you will also want to run setfacl -R -m g:groupname:rwX /path/to/remote/cache (without -d) if any files already exist in the cache.
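The two setfacl calls from this thread (default ACL for future entries, plus a plain ACL for what already exists) can be wrapped together. This is a rough sketch, not a tested recipe: the function name grant_group_acl and the DRY_RUN variable are made up for illustration, and by default the script only prints the commands so you can check them first.

```shell
#!/bin/sh
# Hypothetical wrapper around the two setfacl calls quoted in this thread.
# With DRY_RUN=1 (the default here) it only prints the commands.
grant_group_acl() {
    group=$1
    cache_dir=$2
    # First pass sets the default ACL (-d), second pass fixes existing entries.
    for flags in "-R -d -m" "-R -m"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "setfacl $flags g:$group:rwX $cache_dir"
        else
            # $flags is intentionally unquoted so it splits into options.
            setfacl $flags "g:$group:rwX" "$cache_dir"
        fi
    done
}

plan=$(grant_group_acl groupname /path/to/remote/cache)
echo "$plan"
```

Setting DRY_RUN=0 before the call would actually run setfacl, which of course requires ACL support and sufficient permissions on the server.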

marius311 commented 2 years ago

Is there any possibility this could get looked at? Afaict (and please let me know if I'm missing something) this is still an issue that makes using SSH remotes prohibitive.

Just to summarize the problem, consider two users who have access to a remote server that will be used as a DVC SSH remote. Both users have primary group mygroup on the remote server. If user1 pushes some DVC files to this remote, files and folders will be created owned by group mygroup but with permissions determined by that user's umask, which, depending on how restrictive it is, may still render these files unreadable or undeletable by user2.

Whether the remote folder has the setgid bit (g+s) set is irrelevant (despite being suggested as a solution in some similar Issues): setgid only controls which group new entries belong to, while the user's umask ultimately decides their permissions. The only true solution I've found is the setfacl thing above, or to hound your users to set their umask, but a new user may easily accidentally push some now-undeletable files. Something like shared = group, which exists for local caches, but which would work for SSH remotes would be perfect.
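The umask effect is easy to reproduce locally. This is a small demonstration under assumptions: a POSIX shell, a throwaway temp directory standing in for the remote, and umask 077 standing in for a restrictive user setup. It is not what DVC itself runs.

```shell
#!/bin/sh
# Demonstration in a throwaway directory; nothing here touches a real remote.
demo=$(mktemp -d)
# Try to set g+s on the parent; some systems refuse, which does not change the point.
chmod g+s "$demo" 2>/dev/null || true

(
    umask 077                     # a restrictive umask, as user1 might have
    mkdir "$demo/pushed_dir"      # stands in for a directory created by a push
    : > "$demo/pushed_file"       # stands in for a pushed file
)

# Characters 5-6 of the ls mode string are the group read and write bits.
dir_group=$(ls -ld "$demo/pushed_dir" | cut -c5-6)
file_group=$(ls -l "$demo/pushed_file" | cut -c5-6)
echo "group rw bits on new dir:  $dir_group"
echo "group rw bits on new file: $file_group"
rm -rf "$demo"
```

Both values should come out as dashes, meaning the group has neither read nor write access on the new entries, regardless of g+s on the parent.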

efiop commented 2 years ago

Unfortunately, we don't have capacity to work on this yet 🙁

johnyaku commented 2 years ago

Our ssh remote is on a host where we do not have setfacl permissions, so I'd like to add my support to this feature request. I imagine that is quite common with academic compute infrastructure.

mfakaehler commented 1 year ago

Hi everyone, we are facing the same issue in my organisation. I tried @marius311's and @gwerbin's solution with the setfacl command, but somehow dvc push still creates files and directories in my remote storage with the wrong permissions. Have you also set an ACL mask property like this?

setfacl -d -m mask:07 <my-storage>

Which mask property would I need to set, precisely?