samdoran commented 6 years ago

Proposal: Improve `loop` syntax

Author: Sam Doran <@samdoran>

Date: 2018-02-19

Status: New
Proposal type: core design
Targeted Release: 2.7
Associated PR: None
Estimated time to implement: unknown

Motivation

Loops in Ansible have always been a combination of with_ + [lookup plugin name]. This convention hid the fact that loops were in fact lookup plugins. This "magic" feature of Ansible often confused users.

Moving to a new loop keyword is a step in the right direction of making it more apparent that loops in Ansible use lookup plugins as well as removing some of the "magic" behavior that with_* performed behind the scenes. However, the resulting syntax for all but lists (with_items) and lists of dictionaries is much harder to express in a playbook with the lookup plugin syntax. Experienced Ansible users will mostly likely be very thrown off by the new syntax because it feels more like programming and less like writing playbooks.

Problems

Here are some examples of using pure lookups with the new loop keyword. These are much harder to read, write, and maintain in a playbook that is mostly multi-line YAML. This syntax greatly increases the likelihood for syntax errors, e.g., missing a quote, bracket, or parenthesis. This is particularly true for non-programmers, a significant part of the Ansible community.

Furthermore, wantlist=True or the q() shortcut is needed by most lookups to ensure a list is returned. Although wantlist=True has been around for some time and q() is much better than lookup('foo', ..., wantlist=True), this will bewilder and frustrate users who will most likely only see it as visual noise in their playbooks even though it has legitimate underlying technical reasons for existing, and in fact makes it much more apparent what is actually going on, i.e., less "magic".

Some examples of existing loops using the loop keyword.

`items`

These are fine and don't need to change.

- name: List
  debug:
    msg: "{{ item }}"
  loop:
    - one
    - two
    - three

- name: List of dictionaries
  debug:
    msg: "Name: {{ item.name }}, Path: {{ item.path }}"
  loop:
    - name: Bob
      path: /home/Bob

    - name: Jill
      path: /home/jill

`fileglob`

- name: Fileglob
  debug:
    msg: 'Playbook: {{ item }}'
  loop: "{{ lookup('fileglob', './e*.yml', wantlist=True) }}"

`nested`

This gets really hard to read if you have long lists.

- name: Nested
  debug:
    msg: "{{ item }}"
  loop: "{{ lookup('nested', ['one', 'two'], ['1', '2']), wantlist=True }}"

`first_found`

This is one of the most problematic to cleanly translate to the new syntax, especially if there are a lot of items listed and the skip and paths options are used or if the paths themselves use Jinja variables.

- name: First found
  debug:
    msg: "{{ item }}"
  loop: "{{ lookup('first_found', [playbook_dir + '/files/something.txt', playbook_dir + '/files/something.txt', 'example.txt'], wantlist=True ) }}"

While it is possible to put the long argument parameters inside vars at the task level, this is not immediately apparent and a rather unknown capability. Maybe we could fix this with docs, but it will still be confusing.

Solution proposal

Create a loop syntax that allows specifying the parts of the lookup using YAML rather than requiring rather complex lookup expressions. The "pure" form can still exist for those comfortable with writing long Jinja expressions, but breaking out the arguments separately in the playbook provides a friendlier interface.

The loop field always contains a list or a Jinja expression that evaluates to a list. The loop_control field contains the lookup plugin name(query or lookup) as well as any arguments a particular lookup plugin may accept.

Ideally these would live inside the loop field, but that would most likely be harder to pull off and greatly overload that field.

`fileglob`

- name: Fileglob
  debug:
    msg: 'Playbook: {{ item }}'
  loop: './e*.yml'
  loop_control:
    query: fileglob

`nested`

- name: Nested
  debug:
    msg: "{{ item }}"
  loop:
    - ['one', 'two']
    - ['1', '2']
  loop_control:
    query: nested

`first_found`

- name: First found
  debug: "{{ item }}"
  loop:
    - files:
          - "{{ playbook_dir }}/files/something.txt"
          - "{{ playbook_dir }}/files/something.txt"
      skip: yes
  loop_control:
    query: first_found

Dependencies

None

Testing

Tests for the new syntax need to be written. Tests for each lookup plugin may need to be written, depending on the current level of coverage.

Documentation

A porting guide from with_* to loop for several several of the most common loop types needs to be written (this is already done for with_items). Also, a link to the various lookup plugins and their options from the loops page is also needed to avoid making it look like many of the looping mechanisms in Ansible disappeared.

Closing thoughts

Relying on a complex Jinja expression in the playbook as the primary interface for loops feels like progress in the area of transparency, but regression in the areas of simplicity and usability. One of Ansible's greatest strengths is its simple playbook syntax. While you have to know Jinja to use Ansible in any meaningful capacity, this feels like a bridge too far.

While this may be moving some of the current magic to a new place, I feel that would be worthwhile if it made the user better understand how to use lookups for loops, how to pass arguments to lookups, and kept playbooks easy to read, write, and maintain.

dagwieers commented 6 years ago

@samdoran I agree something needs to be done, and had a similar solution to yours. Especially since using the lookup-mechanism tends to lead to nested quoting, however the with_ construct mechanism was fairly constrained wrt. providing options to a lookup plugin (as a result a lot of plugins abuse the provided data to influence the plugin's working).

I would take this one step further, and simply get rid of lookup plugins (over time) and the need for lookup() or q(). Instead of lookup plugins they would be just normal Jinja2 filters (or filter_plugins) and can be called directly, there's no real need for the existing indirection.

And in that case the keyword query should probably become filter since it would be both manipulating provided data as well as returning information (like every Jinja filter).

bcoca commented 6 years ago

As for the formatting, it is not required to be done the way shown above, with loop you can actually already do very similar use of the existing:

- name: Nested
  debug:
    msg: "{{ item }}"
  loop: "{{ lookup('nested', ['one', 'two'], ['1', '2']), wantlist=True }}"

can be written:

- name: Nested
  debug:
    msg: "{{ item }}"
  loop: "{{ ['one', 'two']|product(['1', '2'])|list }}"

and the first_found is not really a loop, its 'used as a loop' but really is only one item, in any case this:

- name: First found
  debug:
    msg: "{{ item }}"
  loop: "{{ lookup('first_found', [playbook_dir + '/files/something.txt', playbook_dir + '/files/something.txt', 'example.txt'], wantlist=True ) }}"

can be written as:

- name: First found
  debug:
    msg: "{{ item }}"
  loop: "{{  lookup('first_found', findme, wantlist=True) }}"
   vars: 
     findme: 
         - "{{playbook_dir + '/files/something.txt'}}"
         - "{{playbook_dir + '/files/something.txt'}}"
         - 'example.txt'

but really, this does not need to go through a loop at all, this is one good thing about decoupling the plugins from with_:

- name: First found
  debug:
    msg: "{{  lookup('first_found', findme) }}"
   vars: 
     findme: 
         - "{{playbook_dir + '/files/something.txt'}}"
         - "{{playbook_dir + '/files/something.txt'}}"
         - 'example.txt'

@dagwieers I would keep lookup plugins, but remove the ones that overlap with filters.

i.e nested == |product, items == |flatten(1), flatten == |flatten, but password, fileglob and others are actual 'look ups', these don't work well as filters so i would keep the 'class', otherwise they will be come 'tasks delegated to localhost'.

ahuffman commented 6 years ago

I completely agree @samdoran, and like your proposal. The one thing I might change is query to type on loopcontrol, but either is fine with me. The fact that with* is being replaced is fine with me, but moving to pure jinja lookups loses a great deal of readability. Human readability is supposed to be something Ansible touts as a selling point, and this definitely makes reading playbooks far more complex to the uninitiated, as I tend to loop as much as possible to reduce tasks.

ansible / proposals

Improve loop UI #99

Proposal: Improve `loop` syntax

Motivation

Problems

`items`

`fileglob`

`nested`

`first_found`

Solution proposal

`fileglob`

`nested`

`first_found`

Dependencies

Testing

Documentation

Closing thoughts

ansible / proposals

Improve loop UI #99

Proposal: Improve loop syntax

Motivation

Problems

items

fileglob

nested

first_found

Solution proposal

fileglob

nested

first_found

Dependencies

Testing

Documentation

Closing thoughts

Proposal: Improve `loop` syntax

`items`

`fileglob`

`nested`

`first_found`

`fileglob`

`nested`

`first_found`