Open jmuchovej opened 4 years ago
Hey @ionlights , thanks for the detailed feature request!
We really appreciate your effort to make MLHub adaptable for more scenarios.
One remark with regards to the `--user` flag: in case you refer to the user who is used within the started workspace container, here is a related issue: https://github.com/ml-tooling/ml-workspace/issues/11
Currently, all processes (tools, scripts etc.) within the workspace container are executed as the root user. We have not looked into this yet and I am not sure whether we can do so soon. But perhaps this note helps you.
Hmm... as far as I understand, `--user` just maps root inside the container to "my" system-wide `$UID`/`$GID`. I'm not sure it makes a big difference that ml-workspace currently runs as the root user by default. (Definitely not ideal, but that's a "fundamental limitation" of Docker, [at least] last I checked.)
Just to be clear, I was referring to mimicking `PAMAuthenticator` and `SystemUserSpawner`, with a possible addition of the `--user` flag when spinning up the containers, mostly so multi-user systems don't fall into any kind of "permissions hell."
I thought that this is the user used within the container. Hence, if something inside the container needs root permissions, it might not work, but I have no experience with the `--user` flag and could be wrong here.
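For reference, Docker's `--user UID:GID` option sets the user and group that the container's main process runs as; it does not remap root. Files written to a bind-mounted directory are then owned by that UID/GID on the host, which is the "permissions hell" concern raised above. A minimal sketch of assembling such a command follows. The helper name `docker_run_args` is hypothetical, and the `mltooling/ml-workspace` image name and the choice to mount the home directory at the same path are assumptions, not something ml-hub does today.

```python
import os
import shlex


def docker_run_args(image, uid=None, gid=None, home=None):
    """Build a `docker run` argument list that runs the container as uid:gid.

    Hypothetical helper for illustration only; not part of ml-hub.
    """
    # Default to the calling user's numeric IDs (POSIX only).
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid

    args = ["docker", "run", "--rm", "--user", f"{uid}:{gid}"]
    if home is not None:
        # Assumption: bind-mount the host home directory at the same path.
        args += ["--volume", f"{home}:{home}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Image name is an assumption; substitute whatever image you actually run.
    cmd = docker_run_args("mltooling/ml-workspace", home=os.path.expanduser("~"))
    print(shlex.join(cmd))
```

Anything inside the image that expects to run as root (installing packages, writing outside the mounted home directory) may fail under a non-root UID, which matches the caveat in the comment above.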
Besides that, making those functionalities (like `PAMAuthenticator` and `SystemUserSpawner`) compatible with ml-hub and ml-workspace would be great!
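As a point of comparison, this is roughly how the two classes are wired together in a stock JupyterHub plus DockerSpawner deployment. It is only a sketch: whether ml-hub's own spawner can be swapped out this way is unverified and is what this request is asking about. The function name `apply_system_user_config` is hypothetical, and the `/home/{username}` paths are assumptions about the host layout.

```python
def apply_system_user_config(c):
    """Point a JupyterHub config object at local system users.

    In a real jupyterhub_config.py, `c` is the object returned by get_config().
    Hypothetical wrapper for illustration; not part of ml-hub.
    """
    # Authenticate against local system accounts via PAM.
    c.JupyterHub.authenticator_class = "jupyterhub.auth.PAMAuthenticator"

    # Spawn one container per user, tied to that user's system account.
    c.JupyterHub.spawner_class = "dockerspawner.SystemUserSpawner"

    # Home directory locations on the host and inside the container;
    # {username} is filled in by the spawner. These paths are assumptions.
    c.SystemUserSpawner.host_homedir_format_string = "/home/{username}"
    c.SystemUserSpawner.image_homedir_format_string = "/home/{username}"
    return c
```

In an actual config file the body of this function would normally be written at module level, directly against `c`.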
Feature description:
Broadly: Support for `PAMAuthenticator`, `SystemUserSpawner`, and `--user $UID:$GID` flags.
Tying these together, this would allow `ml-hub` to take advantage of local system users. The primary benefit of this is that in a setting where each user can log in and spin up their own `ml-workspace`, they now have a way to tie into their home directory on the host file-system. This allows for a single-location, transportable configuration across multiple workspaces, in the cases where a workspace is used as a "project sandbox" (if you will).
Problem and motivation:
- [...] `ml-hub` for a bit and it's great, but I (and other users on my system) find that we're setting up our shell configurations (and cloning projects) quite a bit.
- [...] `ml-hub` to work from local datasets (e.g. someone working on YouTube-8M – it's difficult to redownload the entire dataset in a reasonable timeframe.)
- [...] `ml-hub`.
- [...] `ml-workspaces` is more transparently accessible.
- [...] `ml-hub` – specify dataset repositories.
- [...] `ml-hub` supports local user mappings, and if there's a way to port this to `singularity`, HPCs could be interested in using this along with some smaller teams of ML researchers/developers.
Is this something you're interested in working on?
Yea! I was planning to do some digging later this week to figure out how challenging an implementation would be.