allegroai / clearml-server

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs

Full multi-user support / security #65

Closed fabio12345 closed 3 years ago

fabio12345 commented 3 years ago

Hi, firstly thanks for this - it's a great solution! Are there plans for full multi-user support? Currently there doesn't seem to be a way to separate access to files. For example, if user A wants to access something that needs authentication, they will need to send all the credentials, fully unencrypted, to the server. That's not ideal in multi-user scenarios: the server is shared, so all users can potentially access the data. As an example, take a secure NAS share. The user would have to mount the share, and to do so they would send their credentials to the server; but then all users would be able to see those credentials. Or am I missing something?

Thanks!

bmartinn commented 3 years ago

> Hi, firstly thanks for this - it's a great solution!

Thank you, @fabio12345 !

> For example, if user A wants to access something that needs authentication, they will need to send all the credentials, fully unencrypted, to the server.

What do you mean by "send all the credentials"? Are you referring to the S3 integration, or the login process? I think I missed the context (or setup) you have in mind :)

fabio12345 commented 3 years ago

Sure, I'll try to be a bit more clear.

In the most generic case, I am thinking of a situation where each user has some secret information that they don't want other users to be able to see, but that is needed during the training process, e.g. to access data. That could be login information for some service, or in my case login information for a network drive. Each user needs to access a network drive, but access to the drive is controlled.

As far as I understand, since the user can execute arbitrary Python code, nothing really prevents them from seeing other users' information on the same machine (for example, from a previous job). I had originally assumed that once the user's code is executing, everything happens as the user running trains-agent, with no further user-specific context - is that right?

It looks like there is some support for "Cloud access" secrets in the UI: https://allegro.ai/docs/webapp/webapp_profile/. However, I'm not sure where the documentation explains how to access them, or whether they support arbitrary secrets. That would be a very useful feature for a variety of use cases. Also, if access to the resources happens within a Docker container, there should be a way of passing the information down to the container - which doesn't seem to exist right now.

A related issue I can think of is that the Git credentials seem to be tied to the agent, rather than to the user running the Task.

To sum up, the use case I have in mind is mounting a networked, access-controlled share in, e.g., a Docker container, while ensuring that access is still controlled.

More generally, there doesn't seem to be a generic way to make per-user secrets accessible to the worker processes based on the requesting user.
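To make the ask concrete, here is a purely hypothetical sketch of what I mean, from the task code's point of view (neither the environment variable nor the injection mechanism exists in trains today; the names are invented for illustration):

```python
import os

# Hypothetical: the agent would inject a secret belonging to the *submitting* user
# into the task's process/container, without other users being able to read it from
# the agent machine or from previous jobs. "USER_NAS_PASSWORD" is an invented name.
nas_password = os.environ.get("USER_NAS_PASSWORD")
if nas_password is None:
    raise RuntimeError("No per-user secret was injected for this task")

# The task would then authenticate against the access-controlled share using that
# secret, and only this user's runs would ever see it.
```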

Thanks again!

bmartinn commented 3 years ago

> ... as I understand, since the user can execute arbitrary Python code, nothing really prevents them from seeing other users' information on the same machine

Not trivial, but yes, that is doable (i.e. accessing the machine's data; obviously it depends on how you set up the trains-agent, but it is not straightforward to make sure there is no leak).
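To illustrate the kind of leak meant here, a minimal sketch (the path below is just the default location trains used at the time; adjust for your setup):

```python
# Any task executed by the agent runs under the agent's OS user (unless you add
# extra isolation), so arbitrary task code can read files on the agent machine.
from pathlib import Path

agent_conf = Path.home() / "trains.conf"  # the agent's own configuration file (default location)
if agent_conf.exists():
    print(agent_conf.read_text())  # would expose any credentials stored in it
```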

> Looks like there is some support for "Cloud access" secrets in the UI

This is used by the browser session to access s3:// links, i.e. it stores access/private keys. Notice these keys are not stored on the trains-server; they are stored in the client's browser session. So on the one hand this is secure, on the other hand you cannot access them from code. If you need these credentials in code, they are usually stored here, and are used automatically by the StorageManager when you access remote URLs.
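As a minimal sketch of how that usually looks (the config section and keys below are how I recall the sdk.aws.s3 settings from the trains docs of that era; please verify against your version):

```python
# Credentials live in the local trains.conf of whichever machine runs the code,
# typically something like (sketch, key names may differ by version):
#   sdk { aws { s3 { credentials: [ { bucket: "my-bucket", key: "...", secret: "..." } ] } } }
from trains import StorageManager

# StorageManager resolves remote URLs and picks up those credentials automatically
local_path = StorageManager.get_local_copy(remote_url="s3://my-bucket/datasets/train.csv")
print(local_path)  # path of the locally cached copy
```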

> ... have per-user secrets accessible ...

I think this is the bottom line: the system is designed to be used by trusted entities in the organization. Actually, the idea is to share as much knowledge as possible among users. For example, this is why it is assumed the trains-agent's git credentials have read access to any repository in the organization.
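For reference, this is roughly where those shared, agent-level credentials live (a sketch of the agent section of trains.conf; the key names are from memory of the trains-agent docs, so treat them as assumptions and check your version):

```
agent {
    # Shared service credentials used when cloning repositories for any Task the
    # agent executes, regardless of which user enqueued it
    git_user: "ci-bot"
    git_pass: "********"

    # Extra arguments added to every docker run the agent launches; also agent-wide,
    # not per-user (e.g. an environment variable or mount visible to all tasks)
    # extra_docker_arguments: ["-e", "SHARED_SECRET=..."]
}
```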

If you are looking into tightening the security of the trains platform, I think you should probably see what they offer at the Enterprise level. These types of controls are usually out of scope for most users of trains (I can just imagine the technical hurdles of making it work with additional layers of security ...). I do know they have LDAP connectivity and finer-grained user permissions. I hope it helps :)

fabio12345 commented 3 years ago

Thanks for your reply, very helpful!

> If you need these credentials in code, they are usually stored here, and are used automatically by the StorageManager when you access remote URLs.

Would those be stored in that file on the client that creates the Task, or on the agent?

bmartinn commented 3 years ago

> Would those be stored in that file on the client that creates the Task, or on the agent?

Actually on both, and they do not have to be identical. The client needs its own credentials so that you can run your code on your machine and access remote URLs. The agent has its own credentials (possibly with broader access) that are used when code is executed via the agent. Following the security/permission discussion, there is nothing preventing you from abusing those credentials when running remotely ...