TOSIT-IO / tdp-collection-prerequisites

Ansible collection with TDP prerequisites
Apache License 2.0

Ansible TDP Collection Prerequisites

Supported distribution


The topology.ini file defines the Ansible groups used by the playbooks in this collection. You can add it to your Ansible inventory configuration.


ansible-playbook playbooks/all.yml

This playbook deploys the following services: Chrony, a CA, an LDAP server, a KDC, and a PostgreSQL server.

HTTP proxy configuration

You can configure an HTTP proxy with Ansible variables:

System configuration

Populate the /etc/hosts file for every host in the [all] Ansible group. For each host, you need to specify the IP address to use with the ip Ansible variable; the same applies to the domain.

Configure the global yum/dnf HTTP proxy and install the required yum/dnf packages.

Configure JAVA_HOME variable inside /etc/environment.

Disable firewalld.

Install needed Python packages with pip3 for PySpark.

ansible-playbook playbooks/system.yml
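
To illustrate the per-host variables that the /etc/hosts step relies on, a YAML inventory might look like the sketch below. The host names and addresses (taken from the reserved documentation range 192.0.2.0/24) are hypothetical, and the variable name domain is an assumption based on the description above; check the collection's defaults before relying on it.

```yaml
# Hypothetical inventory sketch: host names, addresses and domain are placeholders.
all:
  hosts:
    master-01:
      ip: 192.0.2.11        # address written to /etc/hosts for this host
      domain: tdp.example   # assumed variable name for the host's domain
    master-02:
      ip: 192.0.2.12
      domain: tdp.example
```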

Chrony (NTP)

Configure Chrony as NTP client.

ansible-playbook playbooks/chrony.yml

You can configure the NTP servers with the ntp_servers Ansible variable.
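
A minimal sketch of setting ntp_servers in group variables. It assumes the variable takes a list of server host names; the list form and the file location are assumptions, and the pool.ntp.org servers are only placeholders.

```yaml
# group_vars/all.yml (hypothetical location)
# Assumption: ntp_servers is a list of NTP server host names.
ntp_servers:
  - 0.pool.ntp.org
  - 1.pool.ntp.org
```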

Certificate Authority and Certificates

Creates a certificate authority on the [ca_host] Ansible group host and distributes signed certificates and keys to each VM.

ansible-playbook playbooks/certificates.yml

_The certificates will also be downloaded to the roles/certificates/files/tdp_getting_started_certs local project folder._

LDAP and Kerberos

Launches an LDAP server and a KDC on the [kdc] group hosts. Launches a kdcproxy on the [kdcproxy] group hosts.

Installs Kerberos clients on each host and enables SSSD LDAP authentication.

Creates the specified ldap_groups and ldap_users. For each user, a Kerberos principal is created and a keytab is generated in /home/<username>/<username>.keytab on the [users_keytab] group hosts.

You can configure the ticket renewal lifetime:

Note that some tools may malfunction if kerberos_renew_lifetime is greater than kerberos_max_renewable_life.
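
As a sketch of the constraint described above, the two variables could be set so that the renew lifetime does not exceed the maximum renewable life. The value format (a Kerberos duration string such as 7d), the values themselves, and the file location are assumptions; the collection's actual defaults are not shown here.

```yaml
# group_vars/all.yml (hypothetical location)
# Assumption: both variables accept a Kerberos duration string.
# Keep kerberos_renew_lifetime <= kerberos_max_renewable_life.
kerberos_renew_lifetime: 7d
kerberos_max_renewable_life: 7d
```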

ansible-playbook playbooks/ldap_kerberos.yml

_After this, you can log in as the Kerberos admin from any VM with the command kinit admin/admin and the password admin123. When using a Unix user account on the [users_keytab] group hosts, you can log in with kinit -ki, which uses the keytab inside the home directory._

A krb5 configuration file parametrized for HTTPS can be found at /etc/krb5-https.conf on every host. To authenticate through HTTPS and see the logs:

env KRB5_TRACE=/dev/stdout KRB5_CONFIG=/etc/krb5-https.conf kinit <your_user>


PostgreSQL

Launches a PostgreSQL server on the [postgresql] group hosts.

Public repository

By default, the PostgreSQL PGDG public repositories are configured. You can disable them with postgresql_use_public_repo: no and configure your own mirror for these repositories.
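
For example, the public repositories could be turned off in group variables as sketched below. The file location is an assumption, and how the mirror itself is declared is not covered here.

```yaml
# group_vars/postgresql.yml (hypothetical location)
# Skip the PGDG public repositories; a mirror must then be configured separately.
postgresql_use_public_repo: no
```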

Server configuration

You can configure the listen addresses, port, and password encryption with postgresql_listen_addresses, postgresql_port, and postgresql_password_encryption.

Other settings can be set with postgresql_additional_settings.
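
A minimal sketch of the three server variables named above. The values and their formats are assumptions chosen for illustration (md5 matches the METHOD column in the pg_hba.conf examples of this document); see roles/postgresql/defaults/main.yml for the real defaults.

```yaml
# group_vars/postgresql.yml (hypothetical location)
# Assumed value formats, mirroring the corresponding postgresql.conf settings.
postgresql_listen_addresses: "*"      # listen on all interfaces
postgresql_port: 5432                 # standard PostgreSQL port
postgresql_password_encryption: md5   # consistent with md5 entries in pg_hba.conf
```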

Generated settings, users, databases, schemas and pg_hba entries

The required settings, users, databases, schemas and pg_hba entries are created automatically by reading Ansible groups. You can disable this by setting the following variables to an empty array, or replace the generated values with your own:

To add your own entries on top of the generated ones, use these variables:

See roles/postgresql/defaults/main.yml.

The Ansible groups read are:

You can configure default values for user, password and database with:

For pg_hba entries, the address can be specified and defaults to all. Ansible reads the ip hostvar to obtain it. You can change which hostvars key is read with postgresql_generated_hba_address_hostvars_key.

With the example inventory below, two pg_hba entries are generated and filled with each host's ip value.

master-02 ip=
master-03 ip=


pg_hba.conf generated:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    hive            hive          md5
host    hive            hive          md5

If you use a custom variable for the address, you can select it with postgresql_generated_hba_address_hostvars_key: ip_data, as in the example inventory below.

master-02 ip= ip_data=
master-03 ip= ip_data=


pg_hba.conf generated:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    hive            hive           md5
host    hive            hive           md5
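
The ip_data setup described above could also be written as a YAML inventory, as in this sketch. The addresses are placeholders from the reserved documentation ranges, and placing the hosts and the variable directly under all is an assumption made to keep the example self-contained; in practice they would sit in whichever group the collection reads.

```yaml
# Hypothetical YAML inventory: addresses are placeholders, not real values.
all:
  vars:
    # Read the pg_hba address from ip_data instead of the default ip key.
    postgresql_generated_hba_address_hostvars_key: ip_data
  hosts:
    master-02:
      ip: 192.0.2.12          # general-purpose address
      ip_data: 198.51.100.12  # address used for the generated pg_hba entry
    master-03:
      ip: 192.0.2.13
      ip_data: 198.51.100.13
```

With this inventory, the generated pg_hba entries would be expected to carry the ip_data addresses rather than the ip ones.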