Open tiborsimko opened 4 years ago
I ended up in this issue after encountering a problem with permissions inside Docker containers being run by REANA.
I understand REANA performs an implicit user substitution, from `root` to, let's say, `reana` (a non-privileged user identity with UID 1000). I could not find this anywhere in the documentation, and I think it is an important aspect to highlight: the goal of a containerized environment is to control what code runs and how, and this implicit substitution undermines that.
In addition to the necessary documentation changes, I would like to ask: what is the alternative for workflows that require write permissions in certain `/` folders?
I can think of three:
- … `/tmp` folder by default (although it seems like a workaround).
- … `custom`, and use its personal folder (`/home/custom/`).

The third option resembles what is explained in this part of the User's guide (Supporting arbitrary user IDs section). However, it will not work if REANA replaces the Docker-specified `USER` with its non-privileged one even when the specified user is not `root` (which is something I don't know).
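One way to answer the open question empirically is to run a small diagnostic as a workflow step and look at which identity the step actually gets and where it can write. The following is only a sketch: the helper name `describe_runtime` and the list of candidate directories are illustrative choices, not part of REANA.

```python
import os
import tempfile


def describe_runtime(paths):
    """Return the process UID/GID and, for each path, whether it is a writable directory."""
    report = {"uid": os.getuid(), "gid": os.getgid(), "writable": {}}
    for path in paths:
        # A directory is usable for new files only with both write and search permission.
        report["writable"][path] = os.path.isdir(path) and os.access(
            path, os.W_OK | os.X_OK
        )
    return report


if __name__ == "__main__":
    # Hypothetical candidates: root, the home directory, the temp dir, the current directory.
    candidates = ["/", os.path.expanduser("~"), tempfile.gettempdir(), os.getcwd()]
    info = describe_runtime(candidates)
    print(f"uid={info['uid']} gid={info['gid']}")
    for path, ok in info["writable"].items():
        print(f"{path}: {'writable' if ok else 'NOT writable'}")
```

Running this once in an image that sets a non-root `USER` and once in an image that runs as root should show whether the runtime identity differs from the one declared in the image.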
Problem
User containers sometimes set a certain `WORKDIR` in the `Dockerfile` where temporary files may be written, or they assume that the workflow user can write temporary files in the user's home directory. These are all perfectly valid assumptions.
Before, when REANA was running user workflows with superuser privileges, this was OK.
Now, when REANA runs user workflows under a non-privileged user identity (that is, when the user identity is "imposed" and the workflow does not know about it), this may cause write permission problems. For example, uid=0 containers often have `/` as home, but the REANA runtime uid=1000 user cannot write there.
Solution
Several options:
(1) Improve documentation. Document the best practices: workflows should not assume any particular directory or write permissions; the inputs and outputs of each step should be declared; temporary files created within a single step (one that runs multiple commands) should use the runtime workflow workspace as scratch space; we also have REANA_WORKSPACE_PATH and friends; etc. In theory, this will make workflows more portable even outside of REANA, and even outside of container systems, e.g. on HPC. In practice, documentation alone would still leave a hurdle for users.
(2) Improve user home directory management. When starting a workflow step as a user job pod, ensure that the runtime user home directory is writable. Several possibilities here, for example: (2a) create a writable home directory and start commands there, (2b) inject a writable WORKDIR for the container, (2c) switch to using the runtime workspace by default so that "undeclared" writes will be carried out there, and more. Some of these are preferable to others, e.g. the last one may be messy regarding best "isolation" practices.
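To make options (1) and (2a) a bit more concrete from the workflow author's side, here is a minimal sketch of a step that does not assume any writable location. It assumes that REANA_WORKSPACE_PATH is visible to the step as an environment variable, which should be checked against the actual runtime; the helper names `pick_scratch_dir` and `env_with_writable_home` are hypothetical and not REANA APIs.

```python
import os
import tempfile


def _is_writable_dir(path):
    """True if path is an existing directory in which the current user can create files."""
    return bool(path) and os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


def pick_scratch_dir(env=None, subdir="scratch"):
    """Prefer a subdirectory of the workflow workspace; otherwise use a fresh temp dir."""
    env = os.environ if env is None else env
    # Assumption: the workspace location is exported under this variable name.
    workspace = env.get("REANA_WORKSPACE_PATH")
    if _is_writable_dir(workspace):
        path = os.path.join(workspace, subdir)
        os.makedirs(path, exist_ok=True)
        return path
    return tempfile.mkdtemp(prefix="step-")


def env_with_writable_home(env=None):
    """Return a copy of env in which HOME points to a writable directory."""
    base = dict(os.environ if env is None else env)
    if not _is_writable_dir(base.get("HOME", "")):
        # The imposed user cannot write to the image's home (e.g. "/"), so relocate it.
        base["HOME"] = pick_scratch_dir(base, subdir="home")
    return base
```

A step could then launch its commands with `subprocess.run(cmd, env=env_with_writable_home(), cwd=pick_scratch_dir())`. Doing the equivalent on the REANA side, when the job pod is created, is what (2a) and (2b) would amount to.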
Let's muse about various options IRL together with related UID/GID management for Kubernetes? #269
Notes
P.S. Observed while updating the BSM search example: https://github.com/reanahub/reana-demo-bsm-search/issues/14