ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

AWS EC2 inventory ends in failure #12491

Open nasjomach opened 2 years ago

nasjomach commented 2 years ago


Bug Summary

The AWS EC2 inventory sync ends in failure after some days or weeks of use. Job status: Failed

The job ran correctly for some time and then at some point started ending in failure.

Recreating the same inventory source with the exact same options, credentials, etc. makes it end in Successful status.

Failed job log, last entries:

22.907 DEBUG Adding child group zones to parent all
22.909 INFO Loaded 815 groups, 240 hosts

13k output lines (debug)

Successful job log, last entries:

47.597 INFO Inventory import completed for in 34.0s

24k output lines (debug)

This does not seem to happen with AWX 19.5.0.

AWX version

21.1.0


Installation method

kubernetes

Modifications

no

Ansible version

No response

Operating system

No response

Web browser

No response

Steps to reproduce

Create an AWS EC2 inventory source and use it for some time.

Expected results

Inventory source sync ends in success.

Actual results

Failed job log, last entries:

22.907 DEBUG Adding child group zones to parent all
22.909 INFO Loaded 815 groups, 240 hosts

13k output lines (debug)

Additional information

Recreating the same inventory source with the exact same options, credentials, etc. makes it end in Successful status.

sarabrajsingh commented 2 years ago

hey @nasjomach can you copy/paste the output of the API details for this inventory update?

endpoint: /api/v2/inventory_updates/<id>

where <id> is the id number of the job corresponding to the inventory update.

thanks, AWX Team
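For anyone who wants to pull those details with a script rather than the browser, here is a minimal sketch using only the Python standard library. The host name, job id, and token in the usage note are placeholders, and the helper names are hypothetical; it assumes the AWX instance accepts an OAuth2 token in an `Authorization: Bearer` header.

```python
import json
import urllib.request


def inventory_update_url(base_url: str, update_id: int) -> str:
    """Build the detail URL for one inventory update job."""
    return f"{base_url.rstrip('/')}/api/v2/inventory_updates/{update_id}/"


def fetch_inventory_update(base_url: str, update_id: int, token: str) -> dict:
    """GET the job details and return the decoded JSON body.

    Assumption: `token` is a valid AWX OAuth2 token for a user who is
    allowed to read this inventory update.
    """
    request = urllib.request.Request(
        inventory_update_url(base_url, update_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_inventory_update("https://awx.example.com", 1234, "<token>")` and printing the result with `json.dumps(details, indent=2)` gives text that can be pasted into the issue; remember to redact credentials and host names first.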

mick1627 commented 2 years ago

Here is the output of a failed inventory update

sarabrajsingh commented 2 years ago

might be a duplicate of - https://github.com/ansible/awx/issues/12530

mick1627 commented 2 years ago

In our case, the pods that do the inventory sync start and finish successfully. The pods run for about 15 seconds. Then the task in AWX stays in "running" state for about 15-20 seconds and finally finishes with an error.

mick1627 commented 2 years ago

We observed that if we uncheck the "Overwrite" box under Update options on the inventory source, the inventory sync works. Maybe related to https://github.com/ansible/awx/issues/12277

With a new inventory, we tried stopping an instance and re-running the inventory sync: the stopped instance is removed from the inventory, so it works well. We will try the same thing in a few days with a host used in several job templates.
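The same workaround could probably be applied through the API instead of the UI. The sketch below only builds the request and does not send it. It assumes the inventory source is exposed at `/api/v2/inventory_sources/<id>/` with a boolean `overwrite` field, and the helper name, host, and token are hypothetical.

```python
import json
import urllib.request


def build_overwrite_patch(base_url, source_id, token, overwrite=False):
    """Build (but do not send) a PATCH that sets the Overwrite option.

    Assumption: the inventory source endpoint accepts a partial update
    containing only the `overwrite` field.
    """
    url = f"{base_url.rstrip('/')}/api/v2/inventory_sources/{source_id}/"
    body = json.dumps({"overwrite": overwrite}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Note that with Overwrite disabled, hosts and groups that disappear from the source may no longer be cleaned up, so this is a workaround rather than a fix.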

AlanCoding commented 2 years ago

This is a rare log for us to see:

awx.main.dispatch worker pid:437716 is gone (exit=-9)

https://github.com/ansible/awx/blob/cfc1255812e0dabb49c80e42b99edb2278b8c260/awx/main/dispatch/pool.py#L391

This is very descriptive. The process that was saving the inventory data died.
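For readers unfamiliar with the `exit=-9` notation: assuming the value is a Python `multiprocessing` exit code, a negative number means the worker was terminated by the signal with that number, and 9 is SIGKILL on Linux. A small sketch (the helper name is made up for illustration) that decodes such a value:

```python
import signal


def describe_exit(exitcode):
    """Translate a multiprocessing-style exit code into words.

    None      -> the process has not finished yet
    negative  -> the process was terminated by signal number -exitcode
    otherwise -> the process exited on its own with that status
    """
    if exitcode is None:
        return "still running"
    if exitcode < 0:
        return f"killed by signal {signal.Signals(-exitcode).name}"
    return f"exited with status {exitcode}"


print(describe_exit(-9))
```

On Linux this prints `killed by signal SIGKILL`. SIGKILL cannot be caught by the process, so the worker had no chance to log anything itself; what sent the signal is not shown in this log.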