Closed Aureliolo closed 2 weeks ago
This is typically a sign of Kerberos taking a long time to resolve the host with DNS: before the connection is made, it talks to the KDC to get the TGT (if an explicit password was specified) and the service ticket. Unfortunately this is environment specific, but here are some things you can look at:
- KRB5_TRACE, to set a file to log all the Kerberos interactions to
  - KRB5_TRACE=/dev/stdout will display it inline, but if you are running through AWX you might need a file path instead
- Entries in the /etc/hosts file, to see if that avoids this problem
- kinit and kvno, to run the Kerberos steps outside of Ansible
  - kinit username@REALM.COM to replicate the TGT step
  - kvno http@hostname.com to replicate the service ticket step

Thank you very much for the reply and hint!
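The kinit and kvno steps suggested above can be timed individually with KRB5_TRACE set, which may help show whether the delay sits in the TGT step or the service ticket step. The following is only a minimal sketch: the helper names `timed_run` and `replicate_kerberos_steps` and the trace path `/tmp/krb5_trace.log` are assumptions made for illustration, and the principal and service names are the placeholders from the reply.

```python
import os
import subprocess
import time


def timed_run(cmd, trace_path=None):
    """Run cmd, optionally with KRB5_TRACE set, and return (seconds, exit code)."""
    env = dict(os.environ)
    if trace_path is not None:
        # KRB5_TRACE tells MIT Kerberos to log its interactions to this file.
        env["KRB5_TRACE"] = trace_path
    start = time.perf_counter()
    proc = subprocess.run(cmd, env=env)
    return time.perf_counter() - start, proc.returncode


def replicate_kerberos_steps(principal, service, trace_path="/tmp/krb5_trace.log"):
    """Time the TGT step (kinit) and the service ticket step (kvno) separately."""
    steps = (("kinit", ["kinit", principal]), ("kvno", ["kvno", service]))
    return {name: timed_run(cmd, trace_path) for name, cmd in steps}


# Example call (kinit prompts for the password on the terminal):
# results = replicate_kerberos_steps("username@REALM.COM", "http@hostname.com")
# for step, (seconds, exit_code) in results.items():
#     print(f"{step}: {seconds:.2f}s (exit code {exit_code})")
```

Running this a few times inside the same container that AWX uses, and comparing the timings with the trace file, should indicate which step is slow when the delay occurs.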
I did some tests, and I already had it hardcoded in terms of the KDC and admin server. This is the krb5.conf I mount into the AWX containers:
apiVersion: v1
kind: ConfigMap
metadata:
  name: awx-kerberos-config
  namespace: awx
data:
  krb5.conf: |
    [libdefaults]
    default_realm = MY_DOMAIN
    dns_lookup_realm = false
    dns_lookup_kdc = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    rdns = false
    udp_preference_limit = 0
    default_ccache_name = KEYRING:persistent:ansible
    dns_canonicalize_hostname = fallback
    [realms]
    MY_DOMAIN = {
        kdc = server.MY_DOMAIN
        admin_server = server.MY_DOMAIN
    }
    [domain_realm]
    .MY_DOMAIN = MY_DOMAIN
    MY_DOMAIN = MY_DOMAIN
I changed my AWX container to use version v1.0.0b1 of pypsrp, and since then it works fast, with no more delays between tasks. There were no other changes, though, to AD, the krb5.conf, AWX, or anything else.
If that beta is OK to use in production, I guess I'm just going to stick with it :)
DNS itself should be stable, and I set up a debug container where I ran kinit etc. manually. There everything was perfectly fine and fast, even with the old version.
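Since the delays are sporadic, one way to check the DNS side is to time repeated lookups from inside the same container. The sketch below is an illustration only, with hypothetical helper names (`time_lookup`, `dns_timings`). The reverse lookup is included just for comparison, because the posted krb5.conf sets rdns = false.

```python
import socket
import time


def time_lookup(func, arg):
    """Return (seconds, result); result is None when the lookup fails."""
    start = time.perf_counter()
    try:
        result = func(arg)
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses.
        result = None
    return time.perf_counter() - start, result


def dns_timings(hostname, attempts=5):
    """Time forward and reverse lookups several times to expose sporadic delays."""
    rows = []
    for _ in range(attempts):
        forward_secs, address = time_lookup(socket.gethostbyname, hostname)
        reverse_secs = None
        if address is not None:
            reverse_secs, _names = time_lookup(socket.gethostbyaddr, address)
        rows.append((forward_secs, reverse_secs))
    return rows


# Example call with a placeholder hostname:
# for forward_secs, reverse_secs in dns_timings("server.MY_DOMAIN", attempts=10):
#     print(forward_secs, reverse_secs)
```

A single slow row among otherwise fast ones would point at the resolver rather than at pypsrp or Ansible.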
> I changed my AWX container to use version v1.0.0b1 of pypsrp, and since then it works fast, with no more delays between tasks. There were no other changes, though, to AD, the krb5.conf, AWX, or anything else.
That doesn't sound right. The 1.0.0 beta for the pypsrp namespace is still all of the same code, so there should be little to no difference when using it with Ansible. The 1.0.0 release is meant to include the new stuff in the psrp namespace, but I've been dragging my heels in getting it ready.
Well, so far it hasn't happened again since I changed that... I don't really think we need to keep the issue open anymore :)
Hi,
I tried to use psrp with Kerberos and AWX/Ansible for our Windows server automation, but something somewhere seems to be going quite wrong: a simple series of win_ping tasks takes minutes, with huge delays between each connection.
I am using these connection params:
"ansible_connection": "psrp",
"ansible_port": "5985",
"ansible_psrp_auth": "kerberos",
"ansible_psrp_protocol": "http",
"ansible_psrp_negotiate_delegate": true,
"ansible_psrp_negotiate_service": "http",
"ansible_pipelining": "true"
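To separate Ansible from the transport, the same settings can be tried against pypsrp directly and timed per connection. This is a rough sketch under assumptions: the mapping in the hypothetical `psrp_kwargs` helper is my reading of how these variables correspond to pypsrp's `Client` keyword arguments, not something confirmed in this thread, and `time_connections` needs pypsrp installed, a valid Kerberos ticket, and a reachable Windows host.

```python
import time


def psrp_kwargs(ansible_vars):
    """Translate the ansible_psrp_* variables into pypsrp Client keyword arguments.

    Assumed mapping, for illustration only.
    """
    return {
        "port": int(ansible_vars["ansible_port"]),
        "ssl": ansible_vars["ansible_psrp_protocol"] == "https",
        "auth": ansible_vars["ansible_psrp_auth"],
        "negotiate_delegate": ansible_vars["ansible_psrp_negotiate_delegate"],
        "negotiate_service": ansible_vars["ansible_psrp_negotiate_service"],
    }


def time_connections(server, ansible_vars, attempts=5):
    """Open a fresh connection per attempt and time one trivial PowerShell call."""
    from pypsrp.client import Client  # imported lazily; requires pypsrp

    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        client = Client(server, **psrp_kwargs(ansible_vars))
        client.execute_ps("$env:COMPUTERNAME")
        timings.append(time.perf_counter() - start)
    return timings


# Example call with a placeholder hostname:
# params = {"ansible_port": "5985", "ansible_psrp_auth": "kerberos",
#           "ansible_psrp_protocol": "http", "ansible_psrp_negotiate_delegate": True,
#           "ansible_psrp_negotiate_service": "http"}
# print(time_connections("hostname.com", params))
```

If the direct connections are consistently fast while the AWX jobs are not, the delay is more likely in the execution environment than in pypsrp itself.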
As an example, with 3 hosts and 2 ping tasks, some hosts suddenly take around 30 seconds for psrp to kick in and connect.
I did a debug test with Kerberos and also saw that it kept getting new tickets for some reason, which might be part of the issue?
I'm not quite sure where to properly start debugging this issue, to be honest. It's also very sporadic: sometimes tasks start almost instantly, and sometimes they take 30+ seconds to start.
I replaced all hostnames etc. in the task output above, of course, but they are all consistent with each other.
Any help/guidance would be highly appreciated. I am using these pip modules and versions in my AWX EE: gssapi 1.8.3, krb5 0.6.0, pypsrp 0.8.1
Thank you very much!