Ansible collection to deploy the components of TDP

Available Roles

Getting started

The best way to get started with TDP and the Ansible roles is to go through the Getting Started repository.

Install the collection

Ansible 2.9

Ansible 2.9 cannot install a collection from a Git repository with ansible-galaxy. Instead, clone the repository into the folder where Ansible looks for collections.

For example, set the property collections_paths to collections in the [defaults] section of your ansible.cfg.

Then create the folder structure and clone the repository:

mkdir -p collections/ansible_collections/tosit
git clone <repository-url> collections/ansible_collections/tosit/tdp

The project structure should look like this:

├── ansible.cfg
├── collections
│   └── ansible_collections
│       └── tosit
│           └── tdp
│               ├── galaxy.yml
│               ├──
│               └── roles
│                   ├── hadoop
│                   ├── hive
│                   ├── ranger
│                   ├── spark
│                   ├── ...
│                   └── zookeeper
├── roles
└── test.yml

Note that the top-level roles folder does not contain the roles from this collection, but any other roles the project has. The collections folder is the one set in ansible.cfg.

Mitogen 0.2

The collection is compatible with Mitogen 0.2.

In order to activate Mitogen, follow the Mitogen installation guide.

Note: Some of our custom plugins are incompatible with Mitogen. For this reason, we set strategy: linear in some of our playbooks (e.g. hbase_hdfs_init.yml) so that they run correctly in Ansible environments configured with Mitogen.
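
As a minimal sketch of what such a playbook can look like, the play below pins the strategy at play level, which takes precedence over a strategy configured in ansible.cfg. The task is a hypothetical placeholder and is not taken from hbase_hdfs_init.yml; the host group is borrowed from the examples in this README.

```yaml
---
# Sketch only: the task below is an illustrative assumption.
- name: Run a play with the default linear strategy
  hosts: hdfs_nn        # group name reused from the examples in this README
  strategy: linear      # play-level keyword, overrides the configured strategy
  tasks:
    - name: Placeholder for a task that relies on a custom plugin
      ansible.builtin.debug:
        msg: "This play does not use the Mitogen strategy"
```

Setting the keyword per play keeps Mitogen active for every other playbook in the project.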

Ansible 2.10

Plugins and modules

Example usage:

- name: Add directory for spark logs
  delegate_to: "{{ groups['hdfs_nn'][0] }}"
  # The module name is missing in the source text; hdfs_file is assumed here.
  hdfs_file:
    hdfs_conf: "{{ hadoop_conf_dir }}"
    path: "{{ item.path }}"
    state: "{{ item.state | default(omit) }}"
    owner: "{{ item.owner | default(omit) }}"
    group: "{{ item.group | default(omit) }}"
    mode: "{{ item.mode | default(omit) }}"
  become: true
  become_user: "{{ hdfs_user }}"
  # Each loop entry supplies the item.* values used above.
  loop:
    - path: /spark-logs
      state: directory
      owner: "{{ spark_user }}"
      group: "{{ hadoop_group }}"
      mode: '777'

Example usage:

- debug:
    msg: "{{ groups['hdfs_nn'][0] | access_fqdn(hostvars) }}"

- debug:
    msg: "{{ groups['hdfs_jn'] | map('access_fqdn', hostvars) | list }}"
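
The filter can also feed other tasks. The sketch below assumes, based on the examples above, that access_fqdn returns one string per host; the fact name journalnode_fqdns is hypothetical.

```yaml
# Sketch only: journalnode_fqdns is a hypothetical fact name.
- name: Build a comma-separated list of JournalNode FQDNs
  ansible.builtin.set_fact:
    journalnode_fqdns: "{{ groups['hdfs_jn'] | map('access_fqdn', hostvars) | join(',') }}"

- name: Show the resulting list
  ansible.builtin.debug:
    var: journalnode_fqdns
```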

Use a role from the collection

The best way to use the roles from the collection is to import the related playbook from the collection's playbooks directory into another playbook.


- name: Deploy ZooKeeper
  ansible.builtin.import_playbook: ansible_roles/collections/ansible_collections/tosit/tdp/playbooks/zookeeper.yml

- name: Deploy Hadoop
  ansible.builtin.import_playbook: ansible_roles/collections/ansible_collections/tosit/tdp/playbooks/hadoop.yml

- name: Deploy Hive
  ansible.builtin.import_playbook: ansible_roles/collections/ansible_collections/tosit/tdp/playbooks/hive.yml


Dev dependencies

Please follow the contributing guidelines and respect the code of conduct.