SteVwonder opened 4 years ago
What is the use case for user prolog/epilog? (i.e. is there a better way to do what the user's need?)
That's a great question. TBH, I missed the full use-case. Something related to tools cleanup. Maybe @dongahn can summarize the use case better than me.
Olaf Faaland wants to make use of prologue and epilogue scripts on Elmerfudd to:
(1) Run a script to clean up /dev/shm after a job, so that a user who writes data there doesn't reduce the amount available to the next user.
(2) Drop caches after a job, e.g. `echo 3 > /proc/sys/vm/drop_caches`
My interest in this is primarily to ensure that data and metadata written to a remote file system such as Lustre is flushed to disk before the node is made available to other users. This is partially so that we find out about a problem as early as possible and minimize damage done, and partially so that one user can't hurt the following user's performance.
(3) Run a script to set up and destroy a local ephemeral file system for use by the user.
One example is connecting to a remote NVMe device via NVMe-over-Fabrics, formatting the connected device with a file system such as XFS, and setting permissions so that the user can write to it; and then undoing all of that after the job completes.
(4) Run a script to set up and destroy a shared ephemeral file system for use by the user.
Another example is setting up and destroying a shared GFS2 file system. Unlike the local file system setup/destroy case, this would likely need to know the set of nodes participating in the job.
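For use cases (1) and (2), an epilogue could be as small as the sketch below. This is purely illustrative, not a Flux-provided interface; the function names are made up, and it assumes the script is run as root (e.g. via the IMP) after all of the job's processes have exited.

```shell
#!/bin/sh
# Hypothetical epilogue sketch; function names are illustrative only.
# Assumes root privileges and that the job's processes have already exited.

# (1) Remove everything inside a tmpfs directory (e.g. /dev/shm) without
# removing the directory itself, so the next user gets full capacity.
clean_shm_dir() {
    find "$1" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
}

# (2) Flush dirty data and metadata (e.g. to Lustre) before the node is
# handed to the next user, then drop the page/dentry/inode caches.
# Writing to drop_caches requires root, so skip it when not privileged.
flush_and_drop_caches() {
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 3 > /proc/sys/vm/drop_caches
    fi
}

# A real epilogue would run, as root:
#   clean_shm_dir /dev/shm
#   flush_and_drop_caches
```

Surfacing a sync/flush failure here (rather than swallowing it) would also serve the goal of catching file system problems as early as possible.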
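Use case (3) might pair a prologue and epilogue like the following sketch. Every NQN, transport, address, device path, and mount point here is a placeholder I invented for illustration; `JOB_USER` is likewise an assumed variable, not something Flux defines. Setting `RUN=echo` gives a dry run; a real prologue/epilogue would run these commands as root.

```shell
#!/bin/sh
# Hypothetical local-ephemeral-filesystem prologue/epilogue pair.
# All identifiers (NQN, address, device, mount point, JOB_USER) are
# illustrative placeholders. RUN=echo enables a dry run.
RUN="${RUN:-}"

# Prologue: attach the remote NVMe namespace, format it, mount it,
# and give the job's user write access.
setup_local_fs() {
    $RUN nvme connect -t rdma -n nqn.2024-01.example:store -a 192.0.2.10 -s 4420
    $RUN mkfs.xfs -f /dev/nvme1n1
    $RUN mkdir -p /scratch/job
    $RUN mount /dev/nvme1n1 /scratch/job
    $RUN chown "$JOB_USER" /scratch/job
}

# Epilogue: undo the above after the job completes.
teardown_local_fs() {
    $RUN umount /scratch/job
    $RUN nvme disconnect -n nqn.2024-01.example:store
}
```

Use case (4), the shared GFS2 variant, would follow the same shape but additionally needs the set of nodes participating in the job, which is why it can't be a purely node-local script.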
I'm absolutely open to other/better ways to accomplish those tasks.
Unfortunately a job-shell plugin won't work for any of these use cases since it runs as the user of the job, not a privileged process.
We do have support in the IMP (setuid helper) for job prolog and epilog which run as root, but the exec system doesn't have support for invoking the prolog/epilog yet, since that was waiting until the Big Rewrite™ #3346.
If this is high priority, it might just be a couple days to a week of work to support prolog/epilog in the current job-exec module.
We could perhaps work around the "job-shell plugin runs as the user" issue with some creative sudoers configuration and scripting, but I wonder whether prolog/epilog support isn't needed for other testing anyway.
Addressing those use cases somehow is definitely required for us to use elmerfudd with flux.
@ofaaland @jameshcorbett - I moved the discussion over to #2205 since these use cases require full prolog/epilog support and this issue is about a "user prolog/epilog"
Per a conversation on Slack with @grondo and @dongahn: