hortonworks / ansible-hortonworks

Ansible playbooks for deploying Hortonworks Data Platform and DataFlow using Ambari Blueprints
Apache License 2.0

[Proposal] split up the `set_variables.yml` monolith #169

Open lhoss opened 5 years ago

lhoss commented 5 years ago

Motivations

Analysis

Currently the set_variables.yml file is already divided into 3 host sections, along which it can naturally be split up:

Implementation Ideas

Next I want to go into more detail on ideas for each part, to address some of the issues listed above.

P1

This part needs to be run for all roles (except maybe 'common'), because it configures the important, often-used ambari-server group.

IDEAs:

P2

This part contains the most set_fact tasks, so simplifying and condensing it could fix most of the slowness, verbosity and duplication issues of this tasks file mentioned above.

IDEA: Instead of copy-pasting the same logic over and over, use some simple custom Ansible filters (written in a few lines of Python). We might need up to 2 * 2 filters:

To illustrate what the filter API would look like, here is an example for the zk vars:

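    # Proposed filter API (illustrative); Jinja2 expressions must be quoted and wrapped in {{ }}
    - name: Initialize the control variables
      set_fact:
        zookeeper_groups: "{{ blueprint_dynamic | get_groups(['ZOOKEEPER_SERVER']) }}"
        zookeeper_hosts: []
        hdf_hosts: "{{ blueprint_dynamic | get_hosts(['NIFI_MASTER', 'STREAMLINE_SERVER', 'REGISTRY_SERVER']) }}"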
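
To make the filter idea more concrete, here is a rough sketch of what such a filter plugin could look like. It is not the code from the POC. It assumes that each `blueprint_dynamic` entry is a dict with a `host_group` name and a `services` list, and that the file lives in a hypothetical `filter_plugins/blueprint_filters.py`. The extra `inventory_groups` argument of `get_hosts` is also an assumption: a plain filter cannot see Ansible's `groups` variable, so this sketch passes it in explicitly.

```python
# Hypothetical filter plugin, e.g. filter_plugins/blueprint_filters.py
# Assumed shape of each blueprint_dynamic entry:
#   {"host_group": "hdp-master", "clients": [...], "services": [...]}


def get_groups(blueprint, components):
    """Return the names of the host groups that run at least one of `components`."""
    wanted = set(components)
    return [
        entry["host_group"]
        for entry in blueprint
        if wanted & set(entry.get("services", []))
    ]


def get_hosts(blueprint, components, inventory_groups):
    """Return the hosts of all matching host groups, de-duplicated, in order.

    `inventory_groups` is an assumption of this sketch: a mapping shaped like
    Ansible's `groups` variable (group name -> list of host names).
    """
    hosts = []
    for group in get_groups(blueprint, components):
        for host in inventory_groups.get(group, []):
            if host not in hosts:
                hosts.append(host)
    return hosts


class FilterModule(object):
    """Entry point that Ansible looks for when loading custom filters."""

    def filters(self):
        return {"get_groups": get_groups, "get_hosts": get_hosts}
```

With this sketch, a task would call `"{{ blueprint_dynamic | get_hosts(['NIFI_MASTER'], groups) }}"`. Keeping the filters as plain functions also means they can be unit-tested without running Ansible.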

I see only 1 special case, with 1 extra condition, which can be handled by a separate task (or, maybe better, by moving extra checks like 'database==embedded' directly into the blueprint).

Advantages:

P2+P3 vars

The following vars are used in the roles ansible-config and blueprint (and 1 occurrence in common, post-install):

install_hdp
install_hdf
install_hdpsearch

IDEAs (WIP):

P3 ansible_python_interpreter

Question: I agree it's nice to prefer python3 (if found on the server) over the older v2, but isn't it better to have this part configured through ansible.cfg and let the user control it? That would also allow using the right Ansible feature for this: https://docs.ansible.com/ansible/latest/reference_appendices/interpreter_discovery.html
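
For illustration, the user-controlled variant could be as small as the following ansible.cfg fragment. This is only a sketch of the setting described on the linked page, not something the repo ships today; `auto` is one of the documented discovery modes, and an explicit interpreter path can be set instead.

```ini
# ansible.cfg (sketch, not part of the current repo)
[defaults]
# Let Ansible's interpreter discovery pick the Python on each target host.
interpreter_python = auto
```

A single host or group can still be overridden through the `ansible_python_interpreter` inventory variable.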

Next Steps

Unless I learn about some unforeseen blockers to the above ideas, I plan to try out the refactorings in the following order:

lhoss commented 5 years ago

A POC for the most promising part (P2) can be reviewed here: https://github.com/scigility/ansible-hortonworks/pull/2

PS: The change is already being used successfully at a customer (where I could also remove P2/b, as only the 'blueprint_dynamic' method is required).

This is one of the best uses of (custom) 'Ansible Filters' I've done myself 👍 WDYT @alexandruanghel, @agriffaut, @zer0glitch?