cloudera-labs / cloudera-deploy

A general purpose framework for automating Cloudera Products
Apache License 2.0
63 stars 59 forks source link

[Feature Req] Tool/Script to generate "cluster.yml" configs from existing exported CDP cluster template json files #39

Open lhoss opened 3 years ago

lhoss commented 3 years ago

Not sure I'm missing a simpler strategy, on howto prepare a new "cluster.yml" file required by this playbook (to be configured in the definition_path), from an exportable CDP (7.1.x / private-cloud) cluster ? Would be great to get input from Cloudera guys :) Alternatively, or in addition, it would be highly useful/helpful if much more advanced "cluster.yml" example would be added to the repo (for ex. to deploy a 3-master node HA cluster, as this was nicely done in the former HDP repo: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/group_vars/example-hdp-ha-3-masters-with-ranger-atlas )

Once I learn here in the community, there's indeed no such existing script .. I'm happy to write&contribute myself something

Minimal features:

Script Input / Output examples

Output file, following the format of cluster.yml, for ex: roles/cloudera_deploy/defaults/basic_cluster.yml

clusters:
  - name: Basic Cluster
    services: [HDFS, YARN, ZOOKEEPER]
    repositories:
      - https://archive.cloudera.com/cdh7/7.1.4.0/parcels/
    configs:
      ZOOKEEPER:
        SERVICEWIDE:
          zookeeper_datadir_autocreate: true
          zk_server_log_dir": "/var/log/zookeeper"

      HDFS:
        DATANODE:
          dfs_data_dir_list: /dfs/dn
        NAMENODE:
          dfs_name_dir_list: /dfs/nn
...
    host_templates:
      Master1:
        HDFS: [NAMENODE, SECONDARYNAMENODE]
        YARN: [RESOURCEMANAGER, JOBHISTORY]
        ZOOKEEPER: [SERVER]
      Workers:
        HDFS: [DATANODE]
        YARN: [NODEMANAGER]
tmgstevens commented 3 years ago

I definitely agree this is a useful thing to have. In theory it shouldn't be too hard to do a naive mapping from an exported cluster template into a cluster.yml, however the things that might be tricky are:

But if you fancy a stab I'd be happy to review and contribute. It could probably work quite nicely in j2 and then it's potentially possible to operate a "clone-a-cluster" type feature if we're already integrated with Ansible?