Closed popescu-v closed 1 month ago
why I choose to set --allow-run-as-root int khiops-env: OpenMPI strongly discourage to run mpiexec as root. It is so discouraged that the user gets this error if khiops is run as root:
mpirun has detected an attempt to run as root.
Running as root is strongly discouraged as any mistake (e.g., in defining TMPDIR) or bug can result in catastrophic damage to the OS file system, leaving your system in an unusable state.
We strongly suggest that you run mpirun as a non-root user.
You can override this protection by adding the --allow-run-as-root option to your command line. However, we reiterate our strong advice against doing so - please do so at your own risk.
I'm not sure that any data scientist will understand this message. So I decided to add the flag --allow-run-as-root. When we launch khiops, the goal is that the scripts khiops and khiops-env run everywhere without any problem. If we decide to remove this flag we have to had some documentation about this troubleshooting.
I agree, it is very bad to run applications as root, but IMHO we don't have to manage that at the khiops level. Besides, If we use mpich or intelMPI or if we launch khiops with 1 proc, there's nothing to stop us from launching khiops as root.
And finally: what is the real risk of running khiops as root? (khiops never destroys files or directories it has not created itself)
Guys, let me clarify what this mysterious "--allow-run-as-root" is actually doing. If the user launching the mpirun command is root, it will propagate this user id to the launched sub-processes. If the user is a normal user, the children will be launched with his user id. Nothing more, nothing less. Demo - I have a folder owned by root that nobody can read except root:
(base) nmms4680@yd-cnd7506qkx:~/workspace/test$ tree . . ├── rootdir [error opening dir] └── userdir
2 directories, 0 files
(base) nmms4680@yd-cnd7506qkx:~/workspace/test$ sudo tree . ├── rootdir │ └── testfile └── userdir
2 directories, 1 file
Now if i do
(base) nmms4680@yd-cnd7506qkx:~/workspace/test$ mpirun -n 1 -q --allow-run-as-root tree . . ├── rootdir [error opening dir] └── userdir
2 directories, 0 files
If I run the same command as root:
(base) nmms4680@yd-cnd7506qkx:~/workspace/test$ sudo mpirun -n 1 -q --allow-run-as-root tree . . ├── rootdir │ └── testfile └── userdir
2 directories, 1 file
CQFD like we say in french.
So for "normal" users e.g. datascientists there is no risk at all with this option. For strange users who like to play around as root, does it make sense for khiops to try to educate them?
To be clear: it is almost always a bad idea to run anything decently complicated as root. As the openmpi authors stated in https://github.com/open-mpi/ompi/issues/4451#issuecomment-416238568 we can never guarantee that our software is completely bug-free.
In this case the main problem is that we are silently hiding the fact that we are propagating elevated privileges via mpirun, which is not maintained by us.
To put ourselves on the safe side I'd do the following:
The above would protect the user from unknowingly launching the program as root and hide the implementation details (OpenMPI) which the user doesn't really care about.
My 2 cents: I think that running with --allow-as-root
is taking an avoidable,and potentially unbounded, risk, since as @sgouache said we cannot guarantee that our software is 100% bug free.
It is clear that for UX is not ideal, as the message is cryptic. But, the users that will get this message are in Linux (and potentially within Docker). So I think we can assume that they are advanced ones.
I would simply run the application without --allow-as-root
and document in the site that in case of obtaining the message there are two ways of resolve: either changing the mpiexec
parameters or set the related env variables. This should suffice for advanced users. This has the advantage of not having anything to maintain. The option of SG of not Khiops running as root with opt-out is ok for me as well, but I would prefer less maintainance.
Finally, as for the argument "but mpich let us do it", not because before we were at risk, doesn't mean we should continue to be. In that sense SG would also lift the risk in that case.
Whatever the case I vouch for not running with --allow-as-root
by default.
Bilan des options Khiops dédiées MPI
Ne pourrait pas se contenter uniquement de l'option existante KHIOPS_MPI_COMMAND?
Eh oui, Khiops n'est pas 100% bug free, mais on peut le lancer directement sans MPI (cf. coclustering) , avec MPICH, OpenMPI, sur desktop, via docker.... On ne va pas rappeler à chaque fois à l'utilisateur que lancer en root, c'est risqué?
I also agree with @sgouache that, ideally, Khiops itself should refrain, by default, from being run as root, and that the effects of this on the concrete MPI backend (or its lack thereof, for sequential runs) should be, by default, transparent to the end-user.
But, meanwhile, here are my 2 cents:
/
directory are deleted)mpiexec
as root; instead, IMHO we would:OMPI_ALLOW_RUN_AS_ROOT
and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM
environment variables to 1
; thus, in these Dockers, Khiops can be run as root
through the OpenMPI mpiexec
command;OMPI_ALLOW_RUN_AS_ROOT
and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM
environment variables to 1
, or (more involved)KHIOPS_MPI_COMMAND
environment variable so that the --allow-run-as-root
is contained therein; but this is more complex, as it requires to override all options to the mpiexec
command.mpich
allows running Khiops as root by default is somewhat fragilized IMHO by the fact that currently mpich
is used, by default, only in Conda environments that are, by definition, sandboxed to a great extent, thus the risks would be somewhat minimized (but not completely nullified I think, because e.g. a buffer overrun could still overwrite random locations on the drive AFAIU). Moreover, running Conda as a root user in other contexts than Docker containers (and for these, rather the system-wide install would be used) - I fail to see a relevant use-case for that.OMPI_*
environment variables and the rationale of it should not be too mysterious).This discussion has brought to light important security and user experience considerations. UX must remain a priority. However, the fact that Khiops propagates its privileges to OpenMPI, a third-party library, amplifies the risks associated with running processes as root. Indeed, bugs or vulnerabilities in a component external to our could have serious consequences for the whole system, impacting our users.
The proposal to introduce a Khiops-specific flag strikes a balance between security and usability. It empowers users to make informed choices acknowledging the associated risks, especially while OpenMPI naturally inherits the same root privileges as Khiops (what we do to prevent cryptic errors difficult for Khiops users to interpret). Plus, this choice can be made without consulting the doc as we would display a clear message if user executes Khiops in root mode.
Therefore, I vote for the following solution:
--allow-run-as-root
flag from the khiops-env scripts.KHIOPS_ALLOW_RUN_AS_ROOT
) that users can explicitly set to enable root execution.KHIOPS_ALLOW_RUN_AS_ROOT
flag is set, include the --allow-run-as-root
flag in the mpi
command.The kind of questions that helped me make this decision:
OK for me, with the small caveat that, instead of adding the --allow-run-as-root
option to the MPI command, thus modifying it, I'd rather set the 2 environment variables OMPI_ALLOW_RUN_AS_ROOT=1
and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1
if the user sets KHIOPS_ALLOW_RUN_AS_ROOT=true
and OpenMPI is used. If MPICH is used instead of OpenMPI (e.g. in Conda environments), then setting KHIOPS_ALLOW_RUN_AS_ROOT
would have no effect.
Indeed, setting the 2 OMPI_*
environment variables mentioned above would have the advantage that the MPI command itself would not be altered automatically by this configuration. Thus, the user could still fully override the MPI command (as it is now the case) via KHIOPS_MPI_COMMAND
independently of running MPI as root or not.
I agree with @popescu-v.
Another problem with KHIOPS_ALLOW_RUN_AS_ROOT
is that it doesn't scale: If we encounter another security/UX issue we would have to add KHIOPS_ENABLE_PROBLEMATIC_FLAG
.
I argue that it suffices to document the solutions
OMPI_
variablesKHIOPS_MPI_COMMAND
The solution has zero development cost and little documentation cost.
I have to propose what I believe will offer a more consistent UX:
The benefits of the above solution:
Drawbacks:
I close the discussion. A new related issue #344 is created to implement the changes.
Description
Currently, the
--allow-run-as-root
OpenMPImpiexec
option is hard-coded in thekhiops-env
scripts for native installations of the Khiops binaries on Linux operating systems (Ubuntu, Rocky).Or, this is not recommended by the OpenMPI maintainers, mostly for security reasons:
The goal of this issue is to settle on the final approach to the usage of this option.
Questions/Ideas
OMPI_ALLOW_RUN_AS_ROOT
andOMPI_ALLOW_RUN_AS_ROOT_CONFIRM
environment variables to1
before callingmpiexec
. This alternative can be documented in the official documentation, e.g. for users who wish to set-up Docker containers on their own.khiops-env
script if they are not already defined in the environment. Thus one would give the user the possibility to set these two environment variables to0
so that this option is disabled.Context