riptano / ComboAMI

The AMI takes a set of input parameters via the EC2 user-data to install, RAID, ring, and launch a DataStax Enterprise/Community cluster.

VPC cluster that spans availability zones #23

Closed StrongPa55word closed 10 years ago

StrongPa55word commented 11 years ago

I have a VPC network which uses 3 AZs. The DataStax AMI doesn't span availability zones. It's using SimpleSnitch. Are there any plans to add Ec2Snitch and Ec2MultiRegionSnitch?

http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/architecture/architectureSnitchesAbout_c.html#concept_ds_mj2_3qf_fk

StrongPa55word commented 11 years ago

I did see that Ec2Snitch is enabled. However, it's not working.

joaquincasares commented 11 years ago

Hey @titotp,

I'm not sure exactly what's going on, but I'll explain a bit more on what you may be seeing and hopefully that clears up the issue.

I'm assuming your cluster is visible as 3 separate clusters and not one continuous one? If so, that has nothing to do with the snitches. The snitches allow the cluster to see where the nodes fall in terms of the data distribution by placing nodes in different datacenters, and sometimes racks. If you're seeing 3 separate clusters that can't talk to each other, that may just be a routing issue, and I'm not sure where to begin with VPC support on that end. For normal EC2 usage, all nodes know of the seeds (composed of a few nodes' internal IP addresses) and all nodes communicate with each other via those internal IP addresses, which are set as the listen_address in the cassandra.yaml. If you need a different setup, you'll need to set the broadcast_address to the external address, keep the listen_address as the internal address, and compose the seeds from the broadcast_addresses (external IP addresses).
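
The last setup described above can be sketched roughly as follows. This is only an illustration, not something the AMI does for you: the helper name `cross_az_settings` and the IP addresses are made up, while `listen_address`, `broadcast_address`, and the `seed_provider` layout are the usual cassandra.yaml keys. It builds the values each node would need and leaves writing them into cassandra.yaml to you.

```python
from typing import Dict, List


def cross_az_settings(internal_ip: str,
                      external_ip: str,
                      seed_external_ips: List[str]) -> Dict[str, object]:
    """Build the cassandra.yaml values for one node (hypothetical helper)."""
    return {
        # Cassandra binds to the node's internal address.
        "listen_address": internal_ip,
        # Other nodes are told to reach this node on its external address.
        "broadcast_address": external_ip,
        # Seeds are listed by their external (broadcast) addresses.
        "seed_provider": [
            {
                "class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
                "parameters": [{"seeds": ",".join(seed_external_ips)}],
            }
        ],
    }


# Placeholder addresses for illustration only.
settings = cross_az_settings(
    internal_ip="10.0.1.15",
    external_ip="203.0.113.15",
    seed_external_ips=["203.0.113.15", "203.0.113.48"],
)
for key, value in settings.items():
    print(key, "=", value)
```

Every node gets its own internal/external pair, but the seed list should be the same on all of them.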

However, if your cluster is showing up with all the nodes visible under nodetool ring or nodetool status but they all show the same datacenter, then it would be a snitch issue.
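
A quick way to check for that case is to count the distinct `Datacenter:` headers in the nodetool status output. The sketch below assumes you have already captured that output as text; the function name and the trimmed sample are made up for illustration, and real output has more columns.

```python
from typing import List


def datacenters_in_status(status_output: str) -> List[str]:
    """Return the distinct datacenter names from `nodetool status` text."""
    names: List[str] = []
    for line in status_output.splitlines():
        line = line.strip()
        if line.startswith("Datacenter:"):
            name = line.split(":", 1)[1].strip()
            if name not in names:
                names.append(name)
    return names


# Made-up, trimmed sample: every node is reported under one datacenter.
SAMPLE_STATUS = """\
Datacenter: datacenter1
=======================
--  Address    Rack
UN  10.0.1.15  rack1
UN  10.0.2.22  rack1
UN  10.0.3.31  rack1
"""

dcs = datacenters_in_status(SAMPLE_STATUS)
# All nodes visible but only one datacenter: look at the snitch.
single_datacenter = len(dcs) == 1
print(dcs, single_datacenter)
```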

SimpleSnitch will place all nodes into DC1, or something similar, while Ec2Snitch will place nodes in DCs according to EC2 regions and in racks according to EC2 AZs. (However, before you go down that route, I'd highly recommend using separate datacenters and a single rack. It makes setup, maintenance, and emergency situations much easier. If you're thinking about this setup, read up a bit more on the very last caveat here: http://www.datastax.com/dev/blog/multi-datacenter-replication)
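
To make the region/AZ mapping concrete, here is a small sketch of how an availability zone name could be turned into a datacenter and rack. It follows what I understand to be Ec2Snitch's older naming convention (for example, us-east-1a becoming datacenter us-east and rack 1a). Treat the exact strings as an assumption and confirm them against nodetool status on a real node; the function name is made up.

```python
from typing import Tuple


def ec2_snitch_location(availability_zone: str) -> Tuple[str, str]:
    """Approximate (datacenter, rack) for an AZ name (hypothetical helper)."""
    # Rack is the last dash-separated piece, e.g. "1a" in "us-east-1a".
    rack = availability_zone.split("-")[-1]
    # Datacenter is the region, i.e. the AZ name without its trailing letter.
    datacenter = availability_zone[:-1]
    # Assumed legacy behaviour: a region ending in "1" drops the "-1" suffix.
    if datacenter.endswith("1"):
        datacenter = availability_zone[:-3]
    return datacenter, rack


for az in ("us-east-1a", "us-east-1b", "us-west-2a"):
    print(az, ec2_snitch_location(az))
```

Under this mapping, three AZs in the same region end up as three racks inside one datacenter, not as three datacenters.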

If I misunderstood the problem, let me know where exactly I went wrong.

Hope this helps!

joaquincasares commented 10 years ago

Hello, I'm closing out old issues and just realized you may have been expecting cross-AZ launches to work. The reason they won't work is that EC2 reservations are in a single AZ. What I typed above was a manual solution for getting things set up after using the AMI to launch 3 separate clusters.

Cheers.