hortonworks / ansible-hortonworks

Ansible playbooks for deploying Hortonworks Data Platform and DataFlow using Ambari Blueprints
Apache License 2.0

HA and node addition requirement #151

Closed: abh23 closed this issue 5 years ago

abh23 commented 5 years ago

Hello @alexandruanghel,

I have been working on a feature to 'add a new node to an existing HA cluster' in case one of the master nodes is faulty or unhealthy. Please bear with me if I sound a bit illogical, as I am fairly new to the big data space.

I am attaching a sample blueprint in which I am attempting to create a single host-group for master nodes, all with the same components installed, so that when scaling up or adding a new master-type node, I just have to make sure I add it to the 'masters' host_group. sample_blueprint.txt
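
For reference, what I have in mind is roughly along these lines (group names, components, and stack version here are illustrative, not the exact contents of the attached file):

```json
{
  "Blueprints": { "stack_name": "HDP", "stack_version": "2.6" },
  "host_groups": [
    {
      "name": "masters",
      "cardinality": "2",
      "components": [
        { "name": "NAMENODE" },
        { "name": "ZKFC" },
        { "name": "JOURNALNODE" },
        { "name": "ZOOKEEPER_SERVER" },
        { "name": "RESOURCEMANAGER" }
      ]
    },
    {
      "name": "workers",
      "cardinality": "1+",
      "components": [
        { "name": "DATANODE" },
        { "name": "NODEMANAGER" }
      ]
    }
  ]
}
```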

I have made some progress by changing a few items in set_variables.yml, but I need to check whether this approach of creating a blueprint with just 2 host-groups is correct. Will Ambari be able to set up HA in the case of 2 masters in one host-group?

Please feel free to ask for more details; at this point I don't want to add a lot of info since I am unsure what will be relevant. Looking forward to your response.

alexandruanghel commented 5 years ago

Hi @abh23,

I'm not sure what the end goal is here or how you'll be able to implement this "dynamic" masters group. All of the articles I've seen do this only for worker nodes, not master nodes: https://community.hortonworks.com/articles/1333/dynamically-adding-hosts-to-an-existing-cluster-wi.html

From a blueprint perspective, check https://github.com/hortonworks/ansible-hortonworks/issues/44 and https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/roles/ambari-blueprint/templates/blueprint_dynamic.j2#L510. I don't believe it will work, since those blueprint variables expect at least 2 host_groups. You'll also need at least 3 masters for things like JournalNode and ZooKeeper.

But back to your problem: in an HA scenario, if you lose 1 NameNode, you'd still have the other one left and could take manual remediation actions (which could be automated via the Ambari API, but not via blueprint). And in HDP3 I believe you can have 3 NameNodes (although I haven't tested it myself), so the risk of losing 1 is even lower.

abh23 commented 5 years ago

Hey @alexandruanghel, thanks for the timely response.

The end goal here is to have just 2 host-groups (master and worker) and then spawn the number of instances in each host-group as required. For HA, we are currently testing with 2 masters and 1 worker, which of course isn't our deployment or production setup.

I am attempting to make sure all the groups are populated as they were earlier. For that I have already made good progress with changes in set_variables.yml, etc. I can even generate the blueprint template as required for cluster formation, although the cluster creation template is still WIP.

The whole idea I am trying to implement here is to have absolutely identical nodes in each host-group, so that when scaling up the masters or workers no additional handling is required. Obviously it will have its own set of problems, but wouldn't it make sense to have identical nodes?

In the blueprint, we can have multiple host-groups with an 'instance-name', in which the FQDN can also refer to the same instance name. The blueprint may not be easy to read, but it will do the task it is supposed to do.

Your thoughts on this, please.

alexandruanghel commented 5 years ago

Unfortunately it's not that simple :) Most of the master services, especially HDFS NameNode and YARN ResourceManager, do not allow more than 2 instances (or 3 in HDP3). Check this config required by an HA NameNode: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/roles/ambari-blueprint/templates/blueprint_dynamic.j2#L510. That's based on 2 different host-groups, each with a single node.
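
To illustrate, the HA NameNode properties in a blueprint look roughly like the sketch below; each %HOSTGROUP% placeholder needs to resolve to a single host, which is why these properties require separate single-node host-groups rather than one 'masters' group with 2 nodes (the nameservice, group names, and ports are illustrative):

```json
{
  "hdfs-site": {
    "properties": {
      "dfs.nameservices": "mycluster",
      "dfs.ha.namenodes.mycluster": "nn1,nn2",
      "dfs.namenode.rpc-address.mycluster.nn1": "%HOSTGROUP::hdp-master-01%:8020",
      "dfs.namenode.rpc-address.mycluster.nn2": "%HOSTGROUP::hdp-master-02%:8020",
      "dfs.namenode.http-address.mycluster.nn1": "%HOSTGROUP::hdp-master-01%:50070",
      "dfs.namenode.http-address.mycluster.nn2": "%HOSTGROUP::hdp-master-02%:50070"
    }
  }
}
```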

And see this for a complete example of the possibilities: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/group_vars/example-hdp-ha-3-masters-with-nifi-kafka-druid. You'll need to have 3 separate host-groups, each with a single node - these will be your masters, and you cannot scale them. Then you have the hdp-worker and hdp-stream host-groups; these you can scale as much as you want.
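
In host_groups terms that layout looks roughly like this (component lists heavily abbreviated; see the linked example for the real ones):

```json
"host_groups": [
  { "name": "hdp-master-01", "cardinality": "1",
    "components": [ { "name": "NAMENODE" }, { "name": "JOURNALNODE" }, { "name": "ZOOKEEPER_SERVER" } ] },
  { "name": "hdp-master-02", "cardinality": "1",
    "components": [ { "name": "NAMENODE" }, { "name": "JOURNALNODE" }, { "name": "ZOOKEEPER_SERVER" } ] },
  { "name": "hdp-master-03", "cardinality": "1",
    "components": [ { "name": "JOURNALNODE" }, { "name": "ZOOKEEPER_SERVER" } ] },
  { "name": "hdp-worker", "cardinality": "1+",
    "components": [ { "name": "DATANODE" }, { "name": "NODEMANAGER" } ] }
]
```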

On the identical nodes you mention, I didn't understand what you meant. If you mean identical hardware, then sure, it makes sense to use identical hardware. But I'm not sure what you want to do with the FQDN; all hosts in the cluster have to be unique, so you can't have the same FQDN in 2 separate host-groups.

abh23 commented 5 years ago

Sorry for the late reply.

By identical nodes, I meant all masters having the same list of components/services installed. The same goes for the worker nodes.

Imagine a use case where 1 of the masters is DOWN, so the current state of the cluster becomes unhealthy as far as HA is concerned. At this point, I want to add a new node of MASTER type to the cluster, which should join the existing active master and re-form HA. Of course, there will be re-configuration requirements, but the cluster can be brought back to a HEALTHY state.

alexandruanghel commented 5 years ago

No worries.

Ok, I understand what you're trying to do, but I don't believe it will be possible for the master nodes, only for the worker nodes. As I said, it's not possible to define 1 master host-group with 2+ nodes in it. Even if they run identical services, you'd need 2-3 host-groups.

Then, let's say you do this: you have 3 different host-groups for masters and 1 for workers, and 1 master fails. You could theoretically use Ambari API calls to add services. And you could use blueprints to add hosts to a cluster: https://community.hortonworks.com/articles/1333/dynamically-adding-hosts-to-an-existing-cluster-wi.html
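
As a sketch of what that article does: once the blueprint is registered, adding a host is a POST to /api/v1/clusters/<cluster>/hosts/<new-host-fqdn> with a body that names the blueprint and the host-group the new node should join (blueprint and host-group names here are illustrative):

```json
{
  "blueprint": "my-blueprint",
  "host_group": "hdp-worker"
}
```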

But outside that simple example with Kafka, I'm not confident it will work with core master services like the NameNode and ResourceManager.

alexandruanghel commented 5 years ago

Closing due to inactivity.