Open soxofaan opened 1 year ago
This discussion seems relevant: https://github.com/jcmturner/gokrb5/issues/189
I can confirm this was an issue with one of our KDCs being unresponsive.
So this seems to be the same kind of network/connection issue we are also seeing with connections to Elasticsearch from the job trackers.
@tcassaert have you seen this in other applications that use Vault?
(cc @bossie you might also be interested in this thread)
Just saw another kind of failure from the same location:
openeogeotrellis.vault.VaultLoginError: Vault login (Kerberos) failed:
Command '['vault', 'login', '-address=https://vault.....be', '-token-only',
'-method=kerberos', 'username=openeo', 'service=vault-prod',
'realm=...BE', 'keytab_path=openeo.keytab', 'krb5conf_path=/etc/krb5.conf']'
returned non-zero exit status 2.. stderr: "
Error authenticating: couldn't initialize context:
[Root cause: Networking_Error] Networking_Error: TGS Exchange Error:
issue sending TGS_REQ to KDC: failed to communicate with KDC.
Attempts made with TCP (error in getting a TCP connection to any of the KDCs)
and then UDP (error sending to a KDC: error sneding to ipa02.....be:88:
sending over UDP failed to 192.168.207.28:88:
read udp 172.17.0.4:47658->192.168.207.28:88:
i/o timeout; error sneding to ipa01.....be:88:
sending over UDP failed to 192.168.207.29:88:
read udp 172.17.0.4:49111->192.168.207.29:88: i/o timeout)"
This also points to network/connectivity issues.
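If the failures are transient (they come in bursts), one possible mitigation on the client side is to retry the login command a few times with a growing delay. This is only a minimal sketch, not what `openeogeotrellis.vault` actually does: `run_with_retry` and all of its parameters are hypothetical names, and the attempt count, delays and 60 s timeout are arbitrary assumptions.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retry(
    cmd: Sequence[str],
    attempts: int = 3,
    initial_delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run `cmd` and return its stripped stdout.

    Retries when the command exits non-zero or times out, sleeping
    `initial_delay`, then `initial_delay * backoff`, ... between attempts.
    The last failure is re-raised once `attempts` is exhausted.
    (Hypothetical helper, for illustration only.)
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(
                cmd, check=True, capture_output=True, text=True, timeout=60
            )
            return result.stdout.strip()
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff
    raise ValueError("attempts must be >= 1")
```

Here `cmd` would be the same `vault login ...` argument list shown in the traceback above. Retrying only papers over short outages; if both KDCs are unreachable for longer than the total retry window, the login still fails.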
This might be caused by VPN connectivity issues. I will check if I can find anything.
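To help tell a VPN/routing problem apart from an unresponsive KDC, a quick TCP probe of the Kerberos port (88, as in the log above) can be run from the affected container. This is a rough sketch: `tcp_reachable` is a hypothetical helper, the 3 s timeout is an arbitrary assumption, and it only tests TCP, while the log shows the UDP fallback failing as well.

```python
import socket


def tcp_reachable(host: str, port: int = 88, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time.

    Any socket-level failure (refused, timed out, DNS error) counts as
    unreachable. (Hypothetical helper, for illustration only.)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling it for each KDC host listed in `/etc/krb5.conf`, at the moment a login fails, would show whether one KDC or all of them are unreachable from that network.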
The job tracker runs (scheduled every minute) sometimes fail, somewhat in bursts, with this error: