Closed jp-hudson closed 2 years ago
John,

I'm definitely no expert, but maybe first check whether Kerberos itself is working correctly? Does klist -kt indicate that you have a valid keytab? Have a look at this presentation: https://slurm.schedmd.com/slurm_ug_2012/auks-tutorial.pdf
It contains a bunch of interesting commands to validate various steps of the configuration.

Kind regards,
Dries
I agree, please look at the slides; they should help you understand how to make things work and how to do the intermediate checks. Without the content of your configuration files and more of your Kerberos-related configuration, it will be difficult to help with that.
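As a rough illustration of the keytab check suggested above, the output of klist -kt can be reduced to the list of principals the keytab holds. This is only a sketch: the function names are made up for this example, the default path /etc/krb5.keytab is taken from the error message in the original post, and the parsing assumes the usual MIT Kerberos table layout (a dashed rule followed by one entry per line, principal in the last column).

```python
import subprocess
from typing import List


def keytab_principals(klist_output: str) -> List[str]:
    """Return the unique principals found in `klist -kt` output, in order.

    Assumes the MIT layout: header lines, a dashed rule, then one entry
    per line with the principal as the last whitespace-separated field.
    """
    principals: List[str] = []
    in_table = False
    for line in klist_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("----"):
            in_table = True  # entries start after the dashed rule
            continue
        if in_table and stripped:
            principal = stripped.split()[-1]
            if principal not in principals:
                principals.append(principal)
    return principals


def read_keytab(path: str = "/etc/krb5.keytab") -> List[str]:
    """Run `klist -kt PATH` and parse it (usually needs root to read the keytab)."""
    result = subprocess.run(
        ["klist", "-kt", path], capture_output=True, text=True, check=True
    )
    return keytab_principals(result.stdout)
```

Calling read_keytab() as root on the mgmt node should list the host principal that the auks services are expected to use; an empty list, or a principal with an unexpected hostname or realm, would be a first thing to investigate.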
Hey there,
Hope this is the right place for this, if not please tell me.
I am trying to bring up auks with our newly installed slurm implementation but I am having a few problems getting the initial services started. I was hoping you could assist, or help point me in the right direction.
Setting it up on:
Mgmt node
Login node
Compute node
Installing first on mgmt node:
Installed auks via RPMs:
auks-0.4.0-1.x86_64.rpm
auks-debuginfo-0.4.0-1.x86_64.rpm
auks-devel-0.4.0-1.x86_64.rpm
auks-slurm-0.4.0-1.x86_64.rpm
And enabled the auks plugin by adding this to plugstack.conf:
optional /usr/lib64/slurm/auks.so default=enabled spankstackcred=yes minimum_uid=1024
Inside the auks.conf file I have configured PrimaryHost and PrimaryPrincipal. No secondary is configured.
Inside the auks.acl file (I am a bit confused here) I have the admin line set up, and currently it is set to myself. I know that this is not correct; should this be the slurm user? Also, I am not entirely sure what to set for the guest and user roles, or whether they need to be defined at all.
When trying to start the auksd service it hangs on activating and eventually fails. Looking at the auks.log it shows a failure at the krb5_recvauth step:
Wed May 20 10:32:08 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: connection authentication context initialisation succeed
Wed May 20 10:32:08 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: authentication context addrs set up succeed
Wed May 20 10:32:08 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: default kstream initialisation succeed
Wed May 20 10:32:08 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: kstream basic initialisation succeed
Wed May 20 10:32:08 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: keytab initialisation succeed
Wed May 20 10:32:08 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: server kstream initialisation succeed
Wed May 20 10:32:08 2020 [INFO3] [euid=0,pid=31256] worker[6] : krb5 stream successfully initialized for socket 4
Wed May 20 10:32:08 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: authentication failed : Software caused connection abort
Wed May 20 10:32:08 2020 [INFO2] [euid=0,pid=31256] worker[6] : authentication failed on socket 4 (10.232.128.65) : krb5 stream : recvauth stage failed (server side)
Wed May 20 10:32:08 2020 [INFO3] [euid=0,pid=31256] worker[6] : incoming socket 4 processing failed
Wed May 20 10:32:11 2020 [INFO3] [euid=0,pid=31256] dispatcher: incoming connection (3) successfully added to pending queue
Wed May 20 10:32:11 2020 [INFO3] [euid=0,pid=31256] worker[8] : incoming socket 3 successfully dequeued
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: local endpoint stream 3 informations request succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: remote endpoint stream 3 informations request succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: remote host is 10.232.128.65
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: context initialization succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: connection authentication context initialisation succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: authentication context addrs set up succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: default kstream initialisation succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: kstream basic initialisation succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: keytab initialisation succeed
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: server kstream initialisation succeed
Wed May 20 10:32:11 2020 [INFO3] [euid=0,pid=31256] worker[8] : krb5 stream successfully initialized for socket 3
Wed May 20 10:32:11 2020 [INFO4] [euid=0,pid=31256] auks_krb5_stream: authentication failed : Software caused connection abort
Wed May 20 10:32:11 2020 [INFO2] [euid=0,pid=31256] worker[8] : authentication failed on socket 3 (10.232.128.65) : krb5 stream : recvauth stage failed (server side)
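Most of the lines in a log like this are successful initialisation steps, so it may help to pull out only the failures and see which client they come from. A minimal sketch, assuming auks.log holds one entry per line and that failures always follow the "authentication failed on socket N (ADDRESS) : REASON" wording seen here; the function names are invented for this example and are not part of auks:

```python
import re
from collections import Counter

# Pattern for the auksd worker failure lines, e.g.
# "... authentication failed on socket 4 (10.232.128.65) : krb5 stream : ..."
FAILURE = re.compile(r"authentication failed on socket \d+ \(([^)]+)\)\s*:\s*(.*)")


def auth_failures(log_text):
    """Return a list of (remote_address, reason) for each failed authentication."""
    failures = []
    for line in log_text.splitlines():
        match = FAILURE.search(line)
        if match:
            failures.append((match.group(1), match.group(2).strip()))
    return failures


def failures_by_host(log_text):
    """Count failed authentications per remote address."""
    return Counter(address for address, _ in auth_failures(log_text))
```

Run against the excerpt in this post, every failure comes from the same address (10.232.128.65) and stops at the recvauth stage on the server side, which suggests looking at the principal and keytab used between that client and auksd rather than at the network.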
Aukspriv does not seem happy either: it complains that it is unable to get the ccache for the host using the keytab file. I believe it is a good keytab file, but my Kerberos knowledge is not very good.
unable to get ccache for host ____ using ktfile /etc/krb5.keytab : kinit: Client not found in Kerberos database while getting initial credentials.
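"Client not found in Kerberos database" is the KDC reporting that the principal kinit asked for does not exist there. One common cause is a mismatch between the hostname used to build the principal (short name versus FQDN) and the name the host was registered under. The sketch below only builds candidate host principals and compares them with a list of principals from the keytab; the function names are hypothetical, the realm has to be supplied by the caller, and which principal aukspriv actually requests is not confirmed in this thread.

```python
import socket


def candidate_host_principals(realm, hostname=None):
    """Build the host principals a node might be registered under.

    `realm` is supplied by the caller (for example the default realm from
    krb5.conf); `hostname` defaults to this machine's FQDN.
    """
    fqdn = hostname or socket.getfqdn()
    short = fqdn.split(".")[0]
    names = [fqdn] if short == fqdn else [fqdn, short]
    return ["host/{}@{}".format(name, realm) for name in names]


def usable_principals(candidates, keytab_principals):
    """Return the candidates that appear in the keytab (exact string match)."""
    present = set(keytab_principals)
    return [principal for principal in candidates if principal in present]
```

If usable_principals() comes back empty for the principals listed by klist -kt, or if the only match is a form the KDC does not know, the keytab and the KDC entry for the host probably need to be regenerated together.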
Any suggestions or pointers on where to be looking to resolve this would be so helpful.
Best,
John Hudson
Hey bloodbuzz,
Just wondering if you were ever able to figure this out, I'm running into something similar with my setup on CentOS 8. https://github.com/hautreux/auks/issues/45
Closing this; reopen if necessary. It seemed to be related to a krb5 configuration/setup issue.