rocker-org / rocker

R configurations for Docker
https://rocker-project.org
GNU General Public License v2.0

Starting rocker/rstudio with password in Env. variable #455

Closed sdeboudt closed 3 years ago

sdeboudt commented 3 years ago

I was just wondering whether it is wise to spin up containers with passwords provided in an environment variable. From the documentation, if I am right, this is the only way to start an RStudio container unless we disable authentication with DISABLE_AUTH=true. If so, inspecting the running container (docker inspect) reveals the password at a glance in the Config section. As such, this approach is only suitable for a local development environment, or am I missing something? I am a fan of the rocker/rstudio image and use it frequently.
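To make the concern concrete, here is a minimal Python sketch of what anyone with access to `docker inspect` can read back. It parses a trimmed-down, hypothetical sample of the JSON that `docker inspect` prints (the container name and password value are invented for illustration) and pulls selected variables out of `Config.Env`. The helper name `exposed_env` is an assumption for this example, not part of Docker or rocker.

```python
import json

# Hypothetical, trimmed-down `docker inspect` output for one container.
# Real output has many more fields; only Config.Env matters here.
SAMPLE_INSPECT_OUTPUT = """
[
  {
    "Name": "/rstudio-demo",
    "Config": {
      "Image": "rocker/rstudio",
      "Env": [
        "PASSWORD=example-secret",
        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
      ]
    }
  }
]
"""


def exposed_env(inspect_json, names=("PASSWORD",)):
    """Return {container_name: {var: value}} for the selected variables."""
    found = {}
    for container in json.loads(inspect_json):
        # Config.Env is a list of "KEY=value" strings (or missing/null).
        env_list = (container.get("Config") or {}).get("Env") or []
        hits = {}
        for entry in env_list:
            key, _, value = entry.partition("=")
            if key in names:
                hits[key] = value
        if hits:
            found[container.get("Name", "<unnamed>")] = hits
    return found


if __name__ == "__main__":
    print(exposed_env(SAMPLE_INSPECT_OUTPUT))
```

On a real host the same function could be fed the text printed by `docker inspect` for a running container, which is exactly why access to the Docker daemon has to be treated as privileged.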

cboettig commented 3 years ago

@sdeboudt Good question. I'm not a security expert but a few thoughts.

Anyone with access to docker commands like docker inspect or docker exec on the server hosting the RStudio container already has a very high level of access, right? Presumably they would have to ssh into the host server (or have equivalent access through something like docker machine?) in order to run such commands. I believe it is very important to secure and limit access to the host machine; end users should be able to access it only via the RStudio interface. (Obviously, securing the host machine is outside the role of rocker.)

In contrast, a user logging in through the RStudio interface (i.e. in a multi-user environment so presumably using a separate password) cannot see the $PASSWORD environmental variable from RStudio console or RStudio bash terminal.

The usual password considerations on the internet still apply, i.e. for remote servers it is better to set up an HTTPS domain using a reverse proxy (caddyserver makes this really easy), though I believe RStudio's javascript still encrypts passwords on the client side before sending them in. (This applies to general login, whether or not PASSWORD is an env var.) It is of course possible to put the RStudio instance behind an additional authentication layer and then simply disable the RStudio-based password auth. (This is, for instance, what our "r-hub" instruction environment at Berkeley does, allowing students to log in with official campus credentials.)

In general, environmental variables are widely used to hold secure credentials (GitHub tokens, AWS credentials, etc etc) and users should be conscientious about setting, storing, and managing these appropriately, especially in any multi-tenanted environment. Standard practices in using linux user permission settings judiciously for a single container with multiple users, or the creation of multiple isolated containers on the same host, can both accomplish this.
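As one illustration of setting such a variable carefully, the Python sketch below generates a random password and hands it to `docker run` by name only (`-e PASSWORD` with no value), so that Docker copies the value from the launching process's environment and it never appears on the command line or in shell history. The function name `build_run_command` and the defaults are assumptions for this example. Note the limit of the technique: the value is still stored in the container's configuration, so it remains visible to anyone who can run `docker inspect`.

```python
import os
import secrets


def build_run_command(image="rocker/rstudio", port=8787):
    """Return (argv, env, password) for launching RStudio with a random password.

    Nothing is executed here; the caller decides whether to run the command.
    """
    password = secrets.token_urlsafe(16)
    # Copy the current environment and add the secret to it.
    env = dict(os.environ, PASSWORD=password)
    argv = [
        "docker", "run", "--rm", "-d",
        "-p", f"{port}:8787",   # RStudio Server listens on 8787 in the container
        "-e", "PASSWORD",       # name only: docker reads the value from env
        image,
    ]
    return argv, env, password


if __name__ == "__main__":
    argv, env, password = build_run_command()
    # The printed command line contains the variable name but not its value.
    print(" ".join(argv))
```

To actually start the container, one could pass the result to `subprocess.run(argv, env=env, check=True)` on a host where Docker is installed.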

Workflows in which multiple users first log into a shared host before running docker are typical, I think, only in 'traditional HPC' models, where administrators typically disable docker for this reason. Singularity provides an alternative mechanism for allowing users to launch their own containers from the host in a more secure fashion, and is reasonably simple to deploy on personal / cloud servers as well.

Another arguably more modern route to allow users to deploy their own containers securely without exposing their choice of credentials to other tenants on the system is the use of a system like kubernetes on the host to launch containers on demand (what the r-hub system is doing under the hood).

The rocker/binder images provide another alternative (unfortunately needing a patch first for >=1.3.x RStudio server versions) that disables RStudio auth and generates a login token at runtime. (Users still need access to the host environment in this case)

I'm not a security expert, but I hope this helps and welcome further questions and discussion. In any of these issues, it is useful to be clear about the context, in particular who already has access to the host machine and the docker daemon. Ideally this is a single secure administrator, or a secure software solution (kubernetes, singularity, etc) acting on their behalf. If someone has docker inspect powers, then they were effectively already "in".

sdeboudt commented 3 years ago

Indeed @cboettig, my question did not provide a clear context or use case, but by coming back with such a broad spectrum of ideas you have me covered. Thanks for the elaboration!

In our context, we use a shared Docker Engine for development rather than an isolated one running on, say, a local workstation. Just as there are R packages containing functions that run OS commands, you now more often see repos with R packages that run Docker commands under the hood. These packages typically deal with setting up the infrastructure into which the code will be dropped and run. All this comes, e.g., as a set of R scripts that you run as a recipe. As such, it is more comfortable to have access to a Docker Engine from within RStudio to get an integrated workflow experience. But in our case, with a shared engine, that means that if we do not restrict access to the Docker containers, RStudio users can see the passwords of their peers' containers. This is what brought me to the question, because I use this setup constantly without leaving RStudio. I also welcome further discussion if this is of interest.

cboettig commented 3 years ago

yeah, as noted above, if your users have access to docker inspect they probably also have access to docker exec and can thus enter other containers on the machine as root w/o the password, so I think the original issue is still a bit moot?

sdeboudt commented 3 years ago

Well, yeah in fact logging this as an "issue" was not the real intent, but more a way of opening a discussion possibly. I will hereby close the "Issue".