jborean93 / pypsrp

PowerShell Remoting Protocol for Python
MIT License
328 stars 49 forks source link

extremely slow connections in ansible with kerberos auth #190

Closed Aureliolo closed 2 weeks ago

Aureliolo commented 3 months ago

Hi,

I tried to use psrp with kerberos and awx/ansible to do our windows server automation, but it seems something somewhere is going quite wrong. since for a simple series of win_ping it takes minutes. with huge delays between each connection.

I am using these connection params: "ansible_connection": "psrp", "ansible_port": "5985", "ansible_psrp_auth": "kerberos", "ansible_psrp_protocol": "http", "ansible_psrp_negotiate_delegate": true, "ansible_psrp_negotiate_service": "http", "ansible_pipelining": "true"

As an example with 3 hosts and 2 ping tasks, some hosts suddenly take like 30 seconds for the psrp to kick in and do connection.

TASK [Echo number 6] ***********************************************************
task path: /runner/project/playbooks/test-ansible-winrm.yml:39
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<HOST_3_WITH_FQDN> ESTABLISH PSRP CONNECTION FOR USER: MY_AD_CONNECTION_USER@DOMAIN ON PORT 5985 TO HOST_3_WITH_FQDN
<HOST_3_WITH_FQDN> PSRP OPEN RUNSPACE: auth=kerberos configuration=Microsoft.PowerShell endpoint=http://HOST_3_WITH_FQDN:5985/wsman
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<HOST_2_WITH_FQDN> ESTABLISH PSRP CONNECTION FOR USER: MY_AD_CONNECTION_USER@DOMAIN ON PORT 5985 TO HOST_2_WITH_FQDN
<HOST_2_WITH_FQDN> PSRP OPEN RUNSPACE: auth=kerberos configuration=Microsoft.PowerShell endpoint=http://HOST_2_WITH_FQDN:5985/wsman
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<HOST_1_WITH_FQDN> ESTABLISH PSRP CONNECTION FOR USER: MY_AD_CONNECTION_USER@DOMAIN ON PORT 5985 TO HOST_1_WITH_FQDN
<HOST_1_WITH_FQDN> PSRP OPEN RUNSPACE: auth=kerberos configuration=Microsoft.PowerShell endpoint=http://HOST_1_WITH_FQDN:5985/wsman
PSRP: EXEC (via pipeline wrapper)
PSRP: EXEC (via pipeline wrapper)
PSRP: EXEC (via pipeline wrapper)
<HOST_2_WITH_FQDN> PSRP RC: 0
<HOST_2_WITH_FQDN> PSRP STDOUT: {"start":"2024-08-23 17:31:07.500724","stdout":"6\\r\\n","cmd":"echo 6","stderr":"","changed":true,"rc":0,"delta":"0:00:00.609379","end":"2024-08-23 17:31:08.110103"}\r
<HOST_2_WITH_FQDN> PSRP STDERR: #< CLIXML\r
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">2</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>\r
<HOST_2_WITH_FQDN> PSRP CLOSE RUNSPACE: 984C6013-C13B-4CF2-959F-91CA4694441F
changed: [HOST_2] => {
    "changed": true,
    "cmd": "echo 6",
    "delta": "0:00:00.609379",
    "end": "2024-08-23 17:31:08.110103",
    "rc": 0,
    "start": "2024-08-23 17:31:07.500724",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "6\\r\\n",
    "stdout_lines": [
        "6"
    ]
}
<HOST_1_WITH_FQDN> PSRP RC: 0
<HOST_1_WITH_FQDN> PSRP STDOUT: {"start":"2024-08-23 17:31:07.606995","stdout":"6\\r\\n","cmd":"echo 6","stderr":"","changed":true,"rc":0,"delta":"0:00:00.640629","end":"2024-08-23 17:31:08.247625"}\r
<HOST_1_WITH_FQDN> PSRP STDERR: #< CLIXML\r
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">2</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>\r
<HOST_1_WITH_FQDN> PSRP CLOSE RUNSPACE: 9CF97639-42D6-445D-98C7-E13AE7518A8A
changed: [HOST_1] => {
    "changed": true,
    "cmd": "echo 6",
    "delta": "0:00:00.640629",
    "end": "2024-08-23 17:31:08.247625",
    "rc": 0,
    "start": "2024-08-23 17:31:07.606995",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "6\\r\\n",
    "stdout_lines": [
        "6"
    ]
}
<HOST_3_WITH_FQDN> PSRP RC: 0
<HOST_3_WITH_FQDN> PSRP STDOUT: {"start":"2024-08-23 17:31:08.324443","stdout":"6\\r\\n","cmd":"echo 6","stderr":"","changed":true,"rc":0,"delta":"0:00:01.093670","end":"2024-08-23 17:31:09.418113"}\r
<HOST_3_WITH_FQDN> PSRP STDERR: #< CLIXML\r
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">2</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>\r
<HOST_3_WITH_FQDN> PSRP CLOSE RUNSPACE: 5FBD1B5C-59F6-4D00-8510-36C6C99245CB
changed: [HOST_3] => {
    "changed": true,
    "cmd": "echo 6",
    "delta": "0:00:01.093670",
    "end": "2024-08-23 17:31:09.418113",
    "rc": 0,
    "start": "2024-08-23 17:31:08.324443",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "6\\r\\n",
    "stdout_lines": [
        "6"
    ]
}

TASK [Echo number 7] ***********************************************************
task path: /runner/project/playbooks/test-ansible-winrm.yml:42
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
redirecting (type: modules) ansible.builtin.win_shell to ansible.windows.win_shell
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<HOST_3_WITH_FQDN> ESTABLISH PSRP CONNECTION FOR USER: MY_AD_CONNECTION_USER@DOMAIN ON PORT 5985 TO HOST_3_WITH_FQDN
<HOST_3_WITH_FQDN> PSRP OPEN RUNSPACE: auth=kerberos configuration=Microsoft.PowerShell endpoint=http://HOST_3_WITH_FQDN:5985/wsman
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<HOST_2_WITH_FQDN> ESTABLISH PSRP CONNECTION FOR USER: MY_AD_CONNECTION_USER@DOMAIN ON PORT 5985 TO HOST_2_WITH_FQDN
<HOST_2_WITH_FQDN> PSRP OPEN RUNSPACE: auth=kerberos configuration=Microsoft.PowerShell endpoint=http://HOST_2_WITH_FQDN:5985/wsman
Using module file /usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/modules/win_shell.ps1
Pipelining is enabled.
<HOST_1_WITH_FQDN> ESTABLISH PSRP CONNECTION FOR USER: MY_AD_CONNECTION_USER@DOMAIN ON PORT 5985 TO HOST_1_WITH_FQDN
<HOST_1_WITH_FQDN> PSRP OPEN RUNSPACE: auth=kerberos configuration=Microsoft.PowerShell endpoint=http://HOST_1_WITH_FQDN:5985/wsman
PSRP: EXEC (via pipeline wrapper)
<HOST_3_WITH_FQDN> PSRP RC: 0
<HOST_3_WITH_FQDN> PSRP STDOUT: {"start":"2024-08-23 17:31:12.949255","stdout":"7\\r\\n","cmd":"echo 7","stderr":"","changed":true,"rc":0,"delta":"0:00:00.828098","end":"2024-08-23 17:31:13.777353"}\r
<HOST_3_WITH_FQDN> PSRP STDERR: #< CLIXML\r
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">2</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>\r
<HOST_3_WITH_FQDN> PSRP CLOSE RUNSPACE: 55FDB4AD-EE39-4A71-B891-75FB0046A31E
changed: [HOST_3] => {
    "changed": true,
    "cmd": "echo 7",
    "delta": "0:00:00.828098",
    "end": "2024-08-23 17:31:13.777353",
    "rc": 0,
    "start": "2024-08-23 17:31:12.949255",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "7\\r\\n",
    "stdout_lines": [
        "7"
    ]
}
PSRP: EXEC (via pipeline wrapper)
PSRP: EXEC (via pipeline wrapper)
<HOST_1_WITH_FQDN> PSRP RC: 0
<HOST_1_WITH_FQDN> PSRP STDOUT: {"start":"2024-08-23 17:31:42.263006","stdout":"7\\r\\n","cmd":"echo 7","stderr":"","changed":true,"rc":0,"delta":"0:00:00.640620","end":"2024-08-23 17:31:42.903626"}\r
<HOST_1_WITH_FQDN> PSRP STDERR: #< CLIXML\r
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">2</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>\r
<HOST_1_WITH_FQDN> PSRP CLOSE RUNSPACE: 8620D639-2FBC-4899-A5B0-18997D7110FA
changed: [HOST_1] => {
    "changed": true,
    "cmd": "echo 7",
    "delta": "0:00:00.640620",
    "end": "2024-08-23 17:31:42.903626",
    "rc": 0,
    "start": "2024-08-23 17:31:42.263006",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "7\\r\\n",
    "stdout_lines": [
        "7"
    ]
}
<HOST_2_WITH_FQDN> PSRP RC: 0
<HOST_2_WITH_FQDN> PSRP STDOUT: {"start":"2024-08-23 17:31:42.281756","stdout":"7\\r\\n","cmd":"echo 7","stderr":"","changed":true,"rc":0,"delta":"0:00:00.640630","end":"2024-08-23 17:31:42.922386"}\r
<HOST_2_WITH_FQDN> PSRP STDERR: #< CLIXML\r
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="progress" RefId="1"><TNRef RefId="0" /><MS><I64 N="SourceId">2</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>\r
<HOST_2_WITH_FQDN> PSRP CLOSE RUNSPACE: 77DBF312-8D3E-4F07-AD25-7040B70AAD5E
changed: [HOST_2] => {
    "changed": true,
    "cmd": "echo 7",
    "delta": "0:00:00.640630",
    "end": "2024-08-23 17:31:42.922386",
    "rc": 0,
    "start": "2024-08-23 17:31:42.281756",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "7\\r\\n",
    "stdout_lines": [
        "7"
    ]
}

I did a test debug with kerberos and also saw it kept getting new tickets for some reason which might be part of the issue?

Not quite sure on where to properly start debugging this issue tbh. Its also very sporadic. sometimes tasks go quite instant, sometimes they take like 30+ seconds to start really.

I replaced all hostnames etc in the task output above ofc. but all are consistent with each other.

Any help/guidance would be highly appreciated. Using these pip modules and versions in my AWX EE gssapi 1.8.3 krb5 0.6.0 pypsrp 0.8.1

Thank you very much!

jborean93 commented 3 months ago

This is typically a sign with kerberos taking a long time to resolve the host with DNS as before the connection is made it will talk to the KDC to get the TGT (if an explicit password was specified) and service ticket. Unfortunately this is environment specific but things you can look at

Aureliolo commented 3 months ago

Thank you very much for the reply and hint!

Did some tests and I alrady had it harcoded in terms of KDC and admin server. My krb conf i mount to the awx containers:

apiVersion: v1
kind: ConfigMap
metadata:
  name: awx-kerberos-config
  namespace: awx
data:
  krb5.conf: |
    [libdefaults]
      default_realm = MY_DOMAIN
      dns_lookup_realm = false
      dns_lookup_kdc = false
      ticket_lifetime = 24h
      renew_lifetime = 7d
      forwardable = true
      rdns = false
      udp_preference_limit = 0
      default_ccache_name = KEYRING:persistent:ansible
      dns_canonicalize_hostname = fallback
    [realms]
      MY_DOMAIN = {
        kdc = server.MY_DOMAIN
        admin_server = server.MY_DOMAIN
      }
    [domain_realm]
      .MY_DOMAIN = MY_DOMAIN
      MY_DOMAIN = MY_DOMAIN

I changed my awx container to be with version v1.0.0b1 of pypsrp. and since then it works fast. no more delays between tasks. no other changes tough on AD, or krb conf or awx or anything.

If that beta is ok to use in production I guess im just gonna stick to it :)

DNS itself should be stable and i did a debug container where I used manually kinit etc and there everything was perfectly fine and fast even with old version.

jborean93 commented 3 months ago

I changed my awx container to be with version v1.0.0b1 of pypsrp. and since then it works fast. no more delays between tasks. no other changes tough on AD, or krb conf or awx or anything.

That doesn't sound right, the 1.0.0 beta for the pypsrp namespace is still all of the same code so there should be little to no differences when using it with Ansible. The 1.0.0 release is meant to include the new stuff in the psrp namespace but I've been dragging my heels in getting it ready.

Aureliolo commented 2 weeks ago

Well so far it hasn't happened again since i changed that.... don't really thing needed to keep issue open anymore :)