ansible / awx

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

Sliced Job Template creates number of jobs as per the slicing count for the limited host #2893

Open Ompragash opened 5 years ago

Ompragash commented 5 years ago
ISSUE TYPE
COMPONENT NAME
SUMMARY

A Sliced Job Template (SJT) creates as many jobs as its slice count, even when a limit narrows the run to fewer hosts than slices.

ENVIRONMENT
STEPS TO REPRODUCE

  1. Create an Inventory with multiple hosts.
  2. Create an SJT with multiple slices and select the Inventory created above.
  3. Limit the SJT to one of the hosts from that Inventory.
  4. Launch the SJT.

EXPECTED RESULTS

Only one job is created for the limited hosts even if the Job Slicing value is >1.

ACTUAL RESULTS

Multiple jobs are created for the limited hosts as per the Job Slicing count.

ADDITIONAL INFORMATION

Even though multiple jobs are created, only one succeeds; the rest fail with "ERROR! Specified hosts and/or --limit does not match any hosts."

Attachments: sjt-1, sjt-2, sjt-3

donateur commented 5 years ago

This is very similar to a problem I've raised with Red Hat support on behalf of my client - although I'm not sure what SJT is. We find that when re-doing failed hosts there may be fewer hosts than the number of job slices.

I suggested that AWX be modified to either:

  1. Only open as many slices as there are hosts (up to the total number of slices/instances) OR
  2. Ensure that instances with no hosts are marked as successful - nothing to do. It is confusing/wrong that they are marked as failed.
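
The two options above can be sketched in a few lines. This is only an illustration: `slices_to_open` and `status_for_slice` are hypothetical helper names, not AWX functions, and keeping at least one slice open is an assumption made here so that a run with zero hosts still produces a job.

```python
def slices_to_open(configured_slices: int, host_count: int) -> int:
    """Option 1: never open more slices than there are hosts (hypothetical helper)."""
    if configured_slices < 1:
        raise ValueError("configured_slices must be at least 1")
    # Assumption: keep at least one slice so a zero-host run still creates a job.
    return max(1, min(configured_slices, host_count))


def status_for_slice(hosts_in_slice, run_job) -> str:
    """Option 2: a slice with no hosts has nothing to do and counts as successful."""
    if not hosts_in_slice:
        return "successful"
    # run_job is a caller-supplied callable that returns a status string.
    return run_job(hosts_in_slice)
```

For the re-run case described above, `slices_to_open(5, 2)` gives 2, so two failed hosts would no longer be spread over five slices.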
domq commented 4 years ago

SJT might mean Sliced Job Template.

domq commented 4 years ago

The reason this happens is that ansible-playbook doesn't like being told to run for zero hosts; ansible-runner doesn't detect that situation (and is unwilling to change that behavior) and passes the failure upwards.
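
A simplified model of why most slices end up with zero hosts, for illustration only. It assumes hosts are dealt out to slices in a strided fashion before the limit is applied, and it models the limit as plain set membership rather than real Ansible host patterns; `slice_hosts` and `apply_limit` are hypothetical names.

```python
def slice_hosts(hosts, slice_count):
    """Split the sorted host list into slice_count strided groups (assumed distribution)."""
    ordered = sorted(hosts)
    return [ordered[i::slice_count] for i in range(slice_count)]


def apply_limit(slice_members, limit_hosts):
    """Keep only the hosts of one slice that are also named in the limit."""
    return [host for host in slice_members if host in limit_hosts]


inventory = [f"host{i}" for i in range(6)]
limit = {"host4"}

# The slice count is fixed first; the limit is only applied per slice afterwards.
per_slice = [apply_limit(members, limit) for members in slice_hosts(inventory, 3)]
print(per_slice)  # [[], ['host4'], []]
```

Two of the three slices are left with no hosts, which is the situation in which ansible-playbook reports that the limit does not match any hosts.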

Possible approaches for a fix include

gforster commented 4 years ago

Another possible idea is to allow the number to be selected with "prompt on launch." Of course that would only help with known quantities.

donateur commented 4 years ago

> Another possible idea is to allow the number to be selected with "prompt on launch." Of course that would only help with known quantities.

It would also require unnecessary manual action on behalf of the user.

EDIT: That is to say, if a user just clicks to re-run on failed hosts, they may not even be aware of how many there are.

gforster commented 4 years ago

Sure, I'm thinking of the case where you might normally want it split across 3 nodes, but then need to override that for 1. It seems silly to have separate workflows to control that single variable, or to change and save the template each time. The same goes for the middle of a workflow where you know you only want it on fewer nodes than normal. Not the full solution for sure, but it would be handy.

kdelee commented 1 year ago

Given that https://github.com/ansible/ansible/pull/76438 was rejected, Controller as it is does not really have a way of knowing how many hosts may match a filter. AFAIK the filter is passed to ansible and doesn't apply until runtime of the job, so controller makes its decision about how many slices to create BEFORE the filter is applied. Since we've not been able to land fixes in ansible and runner, the only thing I think we could possibly do is some kind of preliminary "apply the filter to the inventory and see how many match" step.

This would almost be like an inventory update that runs before the sliced job with a limit spawns its slices.

I'm thinking:

  1. A sliced job is launched with a limit applied, so we create it with dependencies_processed=false. I'm not sure of the details here about WHEN it becomes a workflow job, but if it is a workflow job from the get-go, it will be new for workflow jobs to have dependencies_processed=false.
  2. We launch some kind of inventory-update-like process that finds out how many hosts the limit will cut the inventory down to, save this as the number of slices to spawn on the workflow (at this point we decide what the workflow nodes will be), and set dependencies_processed=true.
  3. Proceed as we do today; the workflow is now ready to run.
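
A rough sketch of the ordering proposed here, not of how AWX is actually structured. `PendingSlicedJob` and `process_dependencies` are hypothetical names; only the `dependencies_processed` flag comes from the comment. The host count is injected as a callable so the sketch stays independent of how the count is obtained.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PendingSlicedJob:
    """Hypothetical stand-in for a sliced job waiting for its slice decision."""
    requested_slices: int
    limit: str
    dependencies_processed: bool = False
    slices_to_spawn: Optional[int] = None


def process_dependencies(
    job: PendingSlicedJob, count_matching_hosts: Callable[[str], int]
) -> PendingSlicedJob:
    """Resolve the limit first, then record how many slices to spawn."""
    matched = count_matching_hosts(job.limit)
    # Assumption: spawn at least one slice, and never more than there are matching hosts.
    job.slices_to_spawn = max(1, min(job.requested_slices, matched))
    job.dependencies_processed = True
    return job


# Stubbed host count: pretend the limit matches exactly one host.
job = process_dependencies(
    PendingSlicedJob(requested_slices=3, limit="matchinghost"),
    lambda limit: 1,
)
```

With the stub above, `job.slices_to_spawn` ends up as 1 and the job is only marked ready once that number is known.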

I'm sure we could do something more elegant, but a hacky way to do the inventory-like update now might be approximated by what I can do on the CLI:

Given an inventory file named hosts:

[mygroup]
testhost[:100]

[foogroup]
matchinghost

This inventory has 103 hosts. But if I run

ansible -i hosts all --list-hosts --limit matchinghost

I get the output:

  hosts (1):
    matchinghost

This tells me that, of my inventory with 103 hosts, only 1 matches the limit.
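
The CLI check above could be wrapped in a small helper, sketched here under some assumptions: the `ansible` executable is on the PATH, and the output contains a `hosts (N):` header as in the sample. The function names are hypothetical and this is not part of AWX or ansible-runner.

```python
import re
import subprocess


def count_listed_hosts(list_hosts_output: str) -> int:
    """Read N from the 'hosts (N):' header printed by --list-hosts."""
    match = re.search(r"hosts \((\d+)\):", list_hosts_output)
    if match is None:
        raise ValueError("no 'hosts (N):' header found in output")
    return int(match.group(1))


def count_hosts_matching_limit(inventory_path: str, limit: str) -> int:
    """Run the same command as in the comment and return the matched host count."""
    result = subprocess.run(
        ["ansible", "-i", inventory_path, "all", "--list-hosts", "--limit", limit],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if ansible exits non-zero
    )
    return count_listed_hosts(result.stdout)
```

Calling `count_hosts_matching_limit("hosts", "matchinghost")` against the inventory above would be expected to return the same count the CLI reports.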

aleksandar-kinanov commented 3 months ago

Are there any updates on this topic?

There's no way to pass the number of slices through a Workflow Template, and all Job Templates must be pre-configured. This is an issue when you execute against a large number of hosts by default but want to run against a smaller batch whose size is smaller than or close to the number of slices. To work around it, you need to reconfigure all the Job Templates that are part of the Workflow Template.