Open didib opened 3 years ago
@mwperina @mnecas
@didib
We haven't dug into this yet, but I did want to mention that if ansible-navigator is used with mode stdout and artifact file creation disabled, it will run in "pure" subprocess mode without pexpect. (The reason is that without artifact file creation there is no need for the playbook output to be parsed and converted to structured data, so navigator simply hands off the current stdin, stdout, and stderr to runner.)
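The distinction described above can be sketched with Python's standard `subprocess` module. This is only an illustration of handing a child the parent's own stdin, stdout and stderr versus capturing its output for later parsing. It is not ansible-navigator's or ansible-runner's actual code, and the function names are made up for the example.

```python
import subprocess
import sys


def run_inherited(cmd):
    """Run cmd with the parent's stdin/stdout/stderr handed to the child.

    Nothing is captured, so there is nothing to parse afterwards.
    """
    return subprocess.run(cmd, check=False).returncode


def run_captured(cmd):
    """Run cmd with stdout/stderr piped back to the parent.

    subprocess.run reads the pipes until EOF, so output written just
    before the child exits is still collected.
    """
    done = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return done.returncode, done.stdout, done.stderr


if __name__ == "__main__":
    # Hypothetical child that prints one line and exits immediately.
    child = [sys.executable, "-c", "print('last line before exit')"]
    print(run_inherited(child))
    print(run_captured(child))
```

In the first variant the child writes straight to the terminal. In the second, the parent owns the pipes and can turn the output into structured data, which is the case where the output-collection mechanism matters.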
To clarify: Right now, in our use of ansible-navigator in ovirt-system-tests, we do need artifact creation, so stdout mode isn't useful for us.
ISSUE TYPE
SUMMARY
Sometimes we didn't get expected data in the result of --playbook-artifact-save-as. It seems like this might be due to [1]. See also [2].
[1] https://github.com/pexpect/pexpect/issues/373
[2] https://pexpect.readthedocs.io/en/stable/commonissues.html#truncated-output-just-before-child-exits
ANSIBLE-NAVIGATOR VERSION
CONFIGURATION
Default
LOG FILE
I spent quite some time looking at this, and failed to find any clue in the log files - everything looks just fine.
STEPS TO REPRODUCE
Not completely sure. The context is [3]. An example command we run there (copied from pytest.log that it generated):
playbook.yml is:
[3] https://github.com/oVirt/ovirt-system-tests/
EXPECTED RESULTS
playbook-artifacts.json should include the information from ovirt_auth.
ACTUAL RESULTS
That information was missing roughly once every few tens to low hundreds of runs.
ADDITIONAL INFORMATION
This seems to be fixed with [4], which makes ansible-runner use 'subprocess' runner mode instead of the default 'pexpect'.
'Seems' means that when I ran this locally on my machine in a loop, it had not failed after more than 5000 attempts, compared to a few tens to low hundreds of attempts until a failure without the patch. I have also run it in CI twice so far: one run (with 100 attempts) did fail, but it seems like that might have been due to some other reason, and the other run (with 1000 attempts) passed.
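A quick way to judge how convincing a long streak of clean runs is: assuming each run fails independently with a fixed probability (an assumption for illustration, not something measured here), the chance of seeing no failure at all can be computed directly. The failure rates below are illustrative picks from the "few tens to low hundreds" range, and `prob_all_pass` is a name invented for this sketch.

```python
def prob_all_pass(failure_rate, attempts):
    """Probability that `attempts` independent runs all pass,
    given a fixed per-run failure probability."""
    return (1.0 - failure_rate) ** attempts


if __name__ == "__main__":
    # Illustrative per-run failure rates (assumed, not measured).
    for rate in (1 / 50, 1 / 100, 1 / 300):
        for attempts in (100, 1000, 5000):
            p = prob_all_pass(rate, attempts)
            print(f"rate=1/{round(1 / rate)} attempts={attempts}: {p:.3g}")
```

Under these assumptions, even with a failure rate as low as 1 in 300, 5000 clean runs by pure chance would be extremely unlikely, whereas 1000 clean runs would still happen by chance a few percent of the time.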
Locally, I ran this with a few other patches, which I do not think should have affected the result [5][6][8]. In particular, [7], which I thought might fix it, didn't, and the generated logs never included more 'ansible-runner event handle' lines after 'Dequeueing last time'.
Bottom line: This seems to be a bug in pexpect, so it should most likely simply be fixed there. Incidentally, one workaround suggested there (in [1]) was to wrap your command with 'sh -c', which you seem to have tried for an unrelated reason on your 'main' branch very recently (not in 1.1). I only noticed this patch now, and it's disabled there by default; I didn't try it. I am not suggesting [4] as a solution, because I realize (based on reading the code) that pexpect mode does more (it supports passing passwords), which we don't need, but others likely do.
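For reference, the 'sh -c' wrapping mentioned above amounts to something like the following. This is a minimal sketch using only the Python standard library; `wrap_with_sh` is a hypothetical helper, not a function from ansible-navigator, ansible-runner, or pexpect, and whether the wrapping actually avoids the truncation was not tested here.

```python
import shlex


def wrap_with_sh(cmd):
    """Turn an argv list into an equivalent ['sh', '-c', '<quoted command>'].

    shlex.join quotes each argument so the shell sees the same words.
    """
    return ["sh", "-c", shlex.join(cmd)]


if __name__ == "__main__":
    # Example argv; the real command line used in the tests is not shown here.
    print(wrap_with_sh(["ansible-navigator", "run", "playbook.yml"]))
```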
[4] https://github.com/didib/ansible-navigator/commit/671bd6c9b53abaac72eb55483a7858ff5a0b26cd
[5] https://github.com/didib/ansible-navigator/commits/debug-ansible-runner
[6] https://github.com/didib/ansible-runner/commits/debug-stuff
[7] https://github.com/didib/ansible-navigator/commit/491c267993dff09227c589e32c57161bdaff1fb0
[8] https://github.com/didib/ovirt-system-tests/commits/test-ovirt-auth