ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.
Other
14.12k stars 3.44k forks source link

Error in job log output #12233

Open ybucci opened 2 years ago

ybucci commented 2 years ago
ISSUE TYPE
SUMMARY

Some outputs don't load

ENVIRONMENT
STEPS TO REPRODUCE

Run job

EXPECTED RESULTS

Correctly show the output log

ACTUAL RESULTS

Can't se logs without download them

ADDITIONAL INFORMATION

image

fosterseth commented 2 years ago

possible duplicate of https://github.com/ansible/awx/issues/12194

fosterseth commented 2 years ago

@ybucci

do you mind sharing the first 10 events of your job output? please remove any sensitive information beforehand

also, can you share the output of api/v2/jobs/2119/job_events/children_summary

ybucci commented 2 years ago

Hi @fosterseth

First 10 events:

`No config file found; using defaults

PLAY [iis_sincro] **

TASK [Gathering Facts] ***** ok: [OMITED]

Running win_updates - round 1 TASK [wu : Checking for Available Windows Updates] ***************************** ok: [OMITED] => {"changed": false, "failed_update_count": 0, "filtered_updates": {}, "found_update_count": 0, "installed_update_count": 0, "reboot_required": false, "updates": {}} TASK [netscaler_disable_server : Call API Nitro DISABLE server PDVHavan in LoadBalancer] *** skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"} TASK [netscaler_disable_server : Call API Nitro DISABLE server PDVHavan in LoadBalancer] *** skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"} TASK [netscaler_disable_server : Call API Nitro DISABLE server Sistema in LoadBalancer] *** skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"} TASK [netscaler_disable_server : Call API Nitro DISABLE server Sistema in LoadBalancer] *** skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"} TASK [netscaler_save_config : Save Config NetScaler] *************************** skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"} skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"} TASK [wu : Timeout - 1 minuto] ************************************************* skipping: [OMITED] => {"changed": false, "skip_reason": "Conditional result was False"} TASK [wu : Apply Windows update] *********************************************** skipping: [OMITED] => {"changed": false, "skip_reason": "Conditional result was False"} TASK [wu : Restart Service Datadog] ******************************************** skipping: [OMITED] => {"changed": false, "skip_reason": "Conditional result was False"} TASK [netscaler_enable_server : Call API Nitro ENABLE server PDVHavan in LoadBalancer] *** skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"} ` **Output of api/v2/jobs/2119/job_events/children_summary** `{"children_summary":{"2":{"rowNumber":1,"numChildren":256},"3":{"rowNumber":2,"numChildren":235},"4":{"rowNumber":3,"numChildren":2},"7":{"rowNumber":6,"numChildren":15},"11":{"rowNumber":10,"numChildren":15},"15":{"rowNumber":14,"numChildren":15},"19":{"rowNumber":18,"numChildren":15},"23":{"rowNumber":22,"numChildren":15},"27":{"rowNumber":26,"numChildren":39},"32":{"rowNumber":31,"numChildren":11},"35":{"rowNumber":34,"numChildren":11},"38":{"rowNumber":37,"numChildren":11},"41":{"rowNumber":40,"numChildren":15},"45":{"rowNumber":44,"numChildren":15},"49":{"rowNumber":48,"numChildren":15},"53":{"rowNumber":52,"numChildren":15},"62":{"rowNumber":61,"numChildren":11},"65":{"rowNumber":64,"numChildren":3},"66":{"rowNumber":65,"numChildren":2},"127":{"rowNumber":126,"numChildren":3},"128":{"rowNumber":127,"numChildren":2},"189":{"rowNumber":188,"numChildren":3},"190":{"rowNumber":189,"numChildren":2},"251":{"rowNumber":250,"numChildren":6},"252":{"rowNumber":251,"numChildren":2},"255":{"rowNumber":254,"numChildren":2}},"meta_event_nested_uuid":{"9":"c21c6c27-a77b-b1d1-c1da-00000000003a","71":"c21c6c27-a77b-b1d1-c1da-00000000003a","133":"c21c6c27-a77b-b1d1-c1da-00000000003a","195":"c21c6c27-a77b-b1d1-c1da-00000000003a"},"event_processing_finished":true}`
fosterseth commented 2 years ago

@ybucci I suspect it has to do with the highly nested events, what control flow mechanism is this playbook using (e.g. are you looping over items, looping over tasks, etc)

any hints as to how your play is organized might help me recreate the bug locally

ybucci commented 2 years ago

Hi @fosterseth

I think the problem is when import_role is used:

See below the playbook

- name: Checking for Available Windows Updates
  win_updates:
    state: searched
  register: available_updates

- name: Disable server "{{ ansible_hostname }}"_Sistema/PDVHavan in LoadBalancer
  import_role:
    name: netscaler_disable_server
  when: (available_updates.found_update_count|int >= 1)

- name: Timeout - 1 minuto
  win_wait_for:
    timeout: 60
  when: (available_updates.found_update_count|int >= 1)

- name: Apply Windows update
  include_tasks: wu.yml
  when: (available_updates.found_update_count|int >= 1)

- name: Restart Service Datadog
  win_service:
    name: datadogagent
    state: restarted
    force_dependent_services: yes
  ignore_errors: yes
  when: (available_updates.found_update_count|int >= 1)

- name: Enable server "{{ ansible_hostname }}"_Sistema/PDVHavan in LoadBalancer
  import_role:
    name: netscaler_enable_server
  when: (available_updates.found_update_count|int >= 1)

- name: Timeout - 5 minutos
  win_wait_for:
    timeout: 300
  when: (available_updates.found_update_count|int >= 1)

- name: Ping in Healthchecks.io
  delegate_to: localhost
  uri: 
    url: "{{ cronmon_url }}"

And role netscaler_disable_server:

---

- name: Call API Nitro DISABLE server PDVHavan in LoadBalancer
  delegate_to: localhost
  uri:
    url: "http://{{ nsip_115 }}/nitro/v1/config/server?action=disable"
    method: POST
    status_code: 200
    headers:
      Content-Type: application/json
      X-NITRO-USER: "{{ netscaler_user }}"
      X-NITRO-PASS: "{{ netscaler_password }}"
    body_format: json
    body: '{"server":{"name":"{{ ansible_hostname }}_PdvHavan - {{ item }}"}}'
  register: pdvhavan_reg
  ignore_errors: yes
  loop: "{{ (ansible_ip_addresses | select('match','172.16.') | list) }}"

- name: Call API Nitro DISABLE server PDVHavan in LoadBalancer
  delegate_to: localhost
  uri:
    url: "http://{{ nsip_116 }}/nitro/v1/config/server?action=disable"
    method: POST
    status_code: 200
    headers:
      Content-Type: application/json
      X-NITRO-USER: "{{ netscaler_user }}"
      X-NITRO-PASS: "{{ netscaler_password }}"
    body_format: json
    body: '{"server":{"name":"{{ ansible_hostname }}_PdvHavan - {{ item }}"}}'
  register: pdvhavan_reg
  ignore_errors: yes
  loop: "{{ (ansible_ip_addresses | select('match','172.16.') | list) }}"

- name: Call API Nitro DISABLE server Sistema in LoadBalancer
  delegate_to: localhost
  uri:
    url: "http://{{ nsip_115 }}/nitro/v1/config/server?action=disable"
    method: POST
    status_code: 200
    headers:
      Content-Type: application/json
      X-NITRO-USER: "{{ netscaler_user }}"
      X-NITRO-PASS: "{{ netscaler_password }}"
    body_format: json
    body: '{"server":{"name":"{{ ansible_hostname }}_Sistema - {{ item }}"}}'
  register: sistema_reg
  ignore_errors: yes
  loop: "{{ (ansible_ip_addresses | select('match','172.16.') | list) }}"

- name: Call API Nitro DISABLE server Sistema in LoadBalancer
  delegate_to: localhost
  uri:
    url: "http://{{ nsip_116 }}/nitro/v1/config/server?action=disable"
    method: POST
    status_code: 200
    headers:
      Content-Type: application/json
      X-NITRO-USER: "{{ netscaler_user }}"
      X-NITRO-PASS: "{{ netscaler_password }}"
    body_format: json
    body: '{"server":{"name":"{{ ansible_hostname }}_Sistema - {{ item }}"}}'
  register: sistema_reg
  ignore_errors: yes
  loop: "{{ (ansible_ip_addresses | select('match','172.16.') | list) }}"

- name: Save Config NetScaler
  import_role:
    name: netscaler_save_config

...
fosterseth commented 2 years ago

@ybucci thanks, that is helpful information.

I tried writing a simple playbook that has import_role, where the task in that role each have a loop: construct.

I also see that your netscaler_disable_server role also does an additional import_role. I also tried to mimic that.

However, I still can't replicate the bug.

One thing that stands out and is unexplained is this line in your output skipping: [OMITED] => (item=OMITED) => {"ansible_loop_var": "item", "changed": false, "item": "OMITED", "skip_reason": "Conditional result was False"}

I see ansible_loop_var but I don't see loop_control used anywhere e.g.

  loop_control:
    loop_var: outer_item

then a task can use ansible_loop_var to get the "outer_item" name in an included task.

what is the control flow look like for wu.yml?

ybucci commented 2 years ago

Hi @fosterseth

Another example with this bug whitout any loop var

image

---

- name: Stop wuauserv service
  win_service:
    name: wuauserv
    state: stopped

- name: Remove Old WU Registry
  win_regedit:
    path: HKLM:\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU
    state: absent

- name: Remove Old Dir SoftwareDistribution
  win_file:
    path: C:\Windows\SoftwareDistribution
    state: absent

- name: Start wuauserv service
  win_service:
    name: wuauserv
    state: started
No config file found; using defaults

PLAY [servers_hml] *************************************************************

TASK [Gathering Facts] *********************************************************
ok: [OMITED]

TASK [bootstrap : Stop wuauserv service] ***************************************
ok: [OMITED] => {"can_pause_and_continue": false, "changed": false, "depended_by": [], "dependencies": ["rpcss"], "description": "Ativa a detecção, download e instalação de atualizações do Windows e de outros programas. Se o serviço for desativado, os usuários do computador não poderão usar o Windows Update ou o recurso de atualização automática, e os programas não poderão usar a API do Windows Update Agent (WUA).", "desktop_interact": false, "display_name": "Windows Update", "exists": true, "name": "wuauserv", "path": "C:\\\\Windows\\\\system32\\\\svchost.exe -k netsvcs", "start_mode": "delayed", "state": "stopped", "username": "LocalSystem"}

TASK [bootstrap : Remove Old WU Registry] **************************************
changed: [OMITED] => {"changed": true, "data_changed": false, "data_type_changed": false}

TASK [bootstrap : Remove Old Dir SoftwareDistribution] *************************
changed: [OMITED] => {"changed": true}

TASK [bootstrap : Start wuauserv service] **************************************
changed: [OMITED] => {"can_pause_and_continue": false, "changed": true, "depended_by": [], "dependencies": ["rpcss"], "description": "Ativa a detecção, download e instalação de atualizações do Windows e de outros programas. Se o serviço for desativado, os usuários do computador não poderão usar o Windows Update ou o recurso de atualização automática, e os programas não poderão usar a API do Windows Update Agent (WUA).", "desktop_interact": false, "display_name": "Windows Update", "exists": true, "name": "wuauserv", "path": "C:\\\\Windows\\\\system32\\\\svchost.exe -k netsvcs", "start_mode": "delayed", "state": "running", "username": "LocalSystem"}

PLAY [servers_hml] *************************************************************

TASK [Gathering Facts] *********************************************************
ok: [OMITED]

TASK [bootstrap : Stop wuauserv service] ***************************************
changed: [OMITED] => {"can_pause_and_continue": false, "changed": true, "depended_by": [], "dependencies": ["rpcss"], "description": "Ativa a detecção, download e instalação de atualizações do Windows e de outros programas. Se o serviço for desativado, os usuários do computador não poderão usar o Windows Update ou o recurso de atualização automática, e os programas não poderão usar a API do Windows Update Agent (WUA).", "desktop_interact": false, "display_name": "Windows Update", "exists": true, "name": "wuauserv", "path": "C:\\\\Windows\\\\system32\\\\svchost.exe -k netsvcs", "start_mode": "delayed", "state": "stopped", "username": "LocalSystem"}

TASK [bootstrap : Remove Old WU Registry] **************************************
changed: [OMITED] => {"changed": true, "data_changed": false, "data_type_changed": false}

TASK [bootstrap : Remove Old Dir SoftwareDistribution] *************************
changed: [OMITED] => {"changed": true}

TASK [bootstrap : Start wuauserv service] **************************************
changed: [OMITED] => {"can_pause_and_continue": false, "changed": true, "depended_by": [], "dependencies": ["rpcss"], "description": "Ativa a detecção, download e instalação de atualizações do Windows e de outros programas. Se o serviço for desativado, os usuários do computador não poderão usar o Windows Update ou o recurso de atualização automática, e os programas não poderão usar a API do Windows Update Agent (WUA).", "desktop_interact": false, "display_name": "Windows Update", "exists": true, "name": "wuauserv", "path": "C:\\\\Windows\\\\system32\\\\svchost.exe -k netsvcs", "start_mode": "delayed", "state": "running", "username": "LocalSystem"}

PLAY RECAP *********************************************************************
OMITED    : ok=5    changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
OMITED  : ok=5    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

Children Summary:

{"children_summary":{"2":{"rowNumber":1,"numChildren":33},"3":{"rowNumber":2,"numChildren":27},"4":{"rowNumber":3,"numChildren":2},"7":{"rowNumber":6,"numChildren":5},"10":{"rowNumber":9,"numChildren":5},"13":{"rowNumber":12,"numChildren":5},"16":{"rowNumber":15,"numChildren":5},"19":{"rowNumber":18,"numChildren":3},"20":{"rowNumber":19,"numChildren":2}},"meta_event_nested_uuid":{},"event_processing_finished":true}
fosterseth commented 2 years ago

I see that the tasks in the role ran twice, is there a loop or double import in the playbook that incorporates that role?

ybucci commented 2 years ago

@fosterseth

It's because whe using this strategy

- hosts: "{{ hostnames }}"
  strategy: host_pinned
  serial: 1
  roles: 
    - { role: wu, tags: wu }
fosterseth commented 2 years ago

@ybucci

Thanks for your response, that was a big help. I am now able to reproduce the missing events by adding strategy: host_pinned and serial: 1. I'll look into getting a fix in soon

keithjgrant commented 2 years ago

This should be fixed now with #12194 and #12287. If anyone is still seeing this issue, comment here, otherwise we can close this issue