ansible / ansible-runner

A tool and python library that helps when interfacing with Ansible directly or as part of another system whether that be through a container image interface, as a standalone tool, or as a Python module that can be imported. The goal is to provide a stable and consistent interface abstraction to Ansible.
Other
957 stars 352 forks source link

Ansible-runner fails to write large task output to stdout #1371

Open cdavidson2 opened 4 months ago

cdavidson2 commented 4 months ago

Ansible-runner 2.4.0 Ansible 2.17 Python 3.10.12

I'm using ansible-runner to execute ansible playbooks in a containerized environment. Existing configuration works fine.

Occasionally, ansible-runner will stop writing output to stdout but playbook execution will continue. This behaviour is very consistent and seems to occur when a given ansible task returns a large number of output lines. No errors are thrown, only that the ansible-runner process exits to the terminal prompt without showing the remaining task output or the PLAY RECAP.

artifacts/myjob/stdout file matches what I see on my local terminal. artifacts/myjob/job_events lists all the events correctly, including the large number of output lines.

To troubleshoot, I've invoked the podman command directly using all the same parameters as runner (work dir, volumes, env file, ansible-playbook command) and the container is able to write the entire playbook output to stdout with no problem, so likely not an issue with my ansible configuration or callbacks. Changing the ansible verbosity also doesn't affect this behaviour, leading me to believe it's not really about the amount of text written to stdout.

In this instance, I'm receiving the output of a Powershell Get-Module command, but I experience the same behaviour any time I'm working with a large amount of data in a single task. I don't have an exact threshold, but several thousand lines at least.

Expected Behaviour:

TASK [my_redacted_playbook : IncludeLargeOutputTask] ******
included: /runner/project/path/to/my_redacted_playbook/tasks/largeOutput.yml for MyServer01

TASK [my_redacted_playbook : TaskWithLargeOutput] ******
changed: [MyServer01] => changed=true
  debug: []
  error: []
  host_err: ''
  host_out: ''
  information: []
  output:

  < many output lines - 200KB or so >

  result: {}
  verbose: []
  warning: []

PLAY RECAP ******
MyServer01                : ok=11   changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

user@My-PC:~/ansible$

Actual Behaviour:

TASK [my_redacted_playbook : IncludeLargeOutputTask] ******
included: /runner/project/path/to/my_redacted_playbook/tasks/largeOutput.yml for MyServer01
user@My-PC:~/ansible$
Shrews commented 3 months ago

A few initial questions:

alexchowle commented 3 months ago

Maybe a result of #1343 ?

alexchowle commented 3 months ago

I feel like I'm suffering from this, too - although it's not necessarily due to large output. It's actually very hard to reproduce in my environment - all I can say is that it's a problem when I'm operating in a high network latency environment. What I can see is:

  1. The RECAP exists in the artifacts/{ID}/job_events directory as a -partial.json
  2. The last event I receive on my event_handler has a somewhat malformed stdout - the text is truncated.
    {
    'uuid': '44a3d35c-13ec-43a0-8be1-a28bb21dca2d',
    'counter': 80,
    'stdout': 'ok: [${REDACTED_HOSTNAME}',
    'start_line': 45,
    'end_line': 45,
    'runner_ident': '941bae72-1559-4396-8f86-23712089db3e',
    'event': 'runner_on_ok',
    'pid': 5159,
    'created': '2024-06-25T13:21:05.499209',
    'parent_uuid': '02420a00-00b7-3e4b-2f2c-00000000001a',
    'event_data': {
        ...
        ...
    }
    }

I have redacted some of the actual stdout in the above, but when I don't have the issue it's 'stdout': 'ok: [${HOSTNAME_WITH_FQDN]' e.g. [myhost.example.com]. For clarity:

Broken: 'stdout': 'ok: [myhost' OK: 'stdout': 'ok: [myhost.example.com]'

So the same working event looks more like:

{
    'uuid': '44a3d35c-13ec-43a0-8be1-a28bb21dca2d',
    'counter': 80,
    'stdout': 'ok: [${REDACTED_HOSTNAME}]',
    'start_line': 46,
    'end_line': 47,
    'runner_ident': '941bae72-1559-4396-8f86-23712089db3e',
    'event': 'runner_on_ok',
    'pid': 5159,
    'created': '2024-06-25T13:21:05.499209',
    'parent_uuid': '02420a00-00b7-3e4b-2f2c-00000000001a',
    'event_data': {
        ...
        ...
    }
}
  1. I still get updates on my status callback.
  2. The playbook completes successfully.

My ansible-runner workload runs inside of a container as a package dependency of my Python application.