PopiBrossard closed 2 years ago
Hello,
Interesting. A few years ago I had to propose a patch to Slurm to make the slurmstepd internal logic work properly with secured file systems. At that time, things were not called in the right order, and the spank stack was called after the first user file accesses. The patch was accepted, and the slurmstepd logic now works properly with auks for traditional jobs.
I have never had to deal with the get-user-env option of sbatch, so I have never encountered this issue. There is a way to insert some spank code in the prolog logic, but it seems (Google search only, no code review at this point) to be called after the prolog itself, so most probably after the get-user-env logic too. I am pretty sure that only adding code to the auks spank plugin to grab the ticket in the prolog will not be sufficient, and that's pretty much all I could do from that side.
You should open a bug on SchedMD's Bugzilla and ask for their view on this. In the meantime, I would recommend looking for a way to get the ticket using the auks CLI during the su phase. It should be possible using pam_exec and a script that grabs the ticket for the targeted user using the CLI. That's what I would try in order to work around the issue (or ask the users to load their env variables by themselves in their batch script when they need to :))
HTH
Matthieu
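For anyone who wants to try the pam_exec route, here is a rough, untested sketch of what such a helper could look like. Everything specific in it is an assumption: the script path, the PAM line in the comment, the `/tmp/krb5cc_<uid>` cache location (it must match the node's default credential cache), and the `auks -g -u <uid> -C <ccache>` invocation, which should be checked against the auks man page of the installed version. The auks ACLs would also have to allow the compute node to fetch the user's ticket.

```shell
#!/bin/sh
# Hypothetical pam_exec helper: fetch the target user's Kerberos ticket
# from auks when a PAM session is opened for "su - user".
#
# Assumed PAM line (file name and script path are illustrative only):
#   session optional pam_exec.so /usr/local/sbin/auks_pam_fetch.sh

# Return the credential cache path assumed for a given uid.
ccache_path() {
    echo "/tmp/krb5cc_$1"
}

# Fetch the ticket of the given user into that user's credential cache.
fetch_ticket() {
    target_user="$1"
    target_uid=$(id -u "$target_user") || return 1
    cc=$(ccache_path "$target_uid")
    # Assumed auks CLI flags: -g (get), -u (uid), -C (ccache file).
    auks -g -u "$target_uid" -C "$cc" || return 1
    # The cache is written by root here, so hand it over to the user.
    chown "$target_uid" "$cc" && chmod 600 "$cc"
}

# pam_exec exports PAM_USER and PAM_TYPE; only act when a session opens.
if [ -n "${PAM_USER:-}" ] && [ "${PAM_TYPE:-}" = "open_session" ]; then
    fetch_ticket "$PAM_USER" || echo "auks fetch failed for $PAM_USER" >&2
fi
```

The helper only logs a failure and lets the session continue, so a missing ticket does not block su itself; whether that is the right trade-off depends on the site.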
I've created a bug report here: https://bugs.schedmd.com/show_bug.cgi?id=9400
In the meantime, I'm going to ask users to avoid using "--get-user-env".
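As a possible replacement for users, a job script can load the login environment itself once the job is running, which is the point where auks has already provided the ticket for regular jobs. This is only a sketch: the job name and the `load_login_env` helper are made up for illustration, and it assumes the environment is set up in `~/.bash_profile`.

```shell
#!/bin/bash
#SBATCH --job-name=env-demo   # hypothetical job name

# Source a login profile from inside the job instead of relying on
# sbatch's --get-user-env. Defaults to ~/.bash_profile.
load_login_env() {
    profile="${1:-$HOME/.bash_profile}"
    if [ -r "$profile" ]; then
        . "$profile"
    else
        echo "warning: cannot read $profile" >&2
        return 1
    fi
}

# Keep going even if the profile is missing; the job may not need it.
load_login_env || echo "continuing without login environment" >&2

# The real workload would follow here, for example:
#   srun ./my_program        (hypothetical command)
```

Submitting this with a plain `sbatch` (without "--get-user-env") means the profile is read inside the job rather than during the prolog.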
If you're okay with it, I'm going to leave this issue open until Slurm's dev team gives me a solution, so I can share it here in case anybody faces the same issue.
Thanks for your help and your advice.
I contacted Slurm's support team. In response to your second paragraph, they said: "spank handler is called before prolog script and also before get-user-env logic".
I'm using Slurm 17.04 and auks-0.4.4. Slurm's team says that upgrading to Slurm 20.02 won't change Slurm's behaviour here, so it won't fix my issue.
OK, I did not remember the logic working this way, but I have not looked at the slurmd internal action pipeline for a while.
Closing this, as I suppose you worked around it with your users.
Hi,
I'm trying to use auks with Slurm, but I can't make it work in one specific case. User home directories in my cluster are mounted with Kerberos security, which is why auks is needed here. In a simple case, like
srun ls ~
everything is fine: I can access my home using my Kerberos ticket. But when using "--get-user-env" with sbatch, "_run_prolog" doesn't have access to the home. In other words, the command
su - my_username
on the node running the job doesn't work in the context of _run_prolog when it tries to execute the .bash_profile in my home. Some Slurm users need to load their environment this way. It feels like auks loads credentials only during the real job and not during "_run_prolog", even though they are required there. I can see in Slurm's logs:
With a pstree -p I can see something like this:
Maybe it's because the
su
isn't launched by slurmstepd? Do you know how to solve this issue? Is there a configuration parameter I missed?
Thanks