zalando-stups / stups-etcd-cluster

Etcd cluster appliance for the STUPS (AWS) environment

More than one etcd cluster per AWS account #13

Closed regispl closed 8 years ago

regispl commented 8 years ago

More a question than an issue, at least for now... Is my understanding correct that you can't have more than 1 etcd cluster per AWS account by default?

I'm looking at the code here and it seems that the Route53 entries are sort of hardcoded in etcd.py, which means that every deployment of a new cluster will overwrite the entries of old ones? They're parametrised, but only by stack version and hosted zone, not by stack name.

    record_name = '_etcd-server._tcp.{}.{}'.format(stack_version, self.hosted_zone)
    (...)
    record_name = '_etcd._tcp.{}.{}'.format(stack_version, self.hosted_zone)
    (...)
    self.update_record(conn, zone_id, 'A', 'etcd-server.{}.{}'.format(stack_version, self.hosted_zone), new_record)

Is my understanding correct? Is that intended? What if I want to have 2-3 completely separate stacks per AWS account (e.g. dev, staging, prod)? The workaround seems to be using non-clashing versions even for different stack names, but that seems error-prone...
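To illustrate the concern, here is a minimal sketch of the naming scheme quoted above. The helper `route53_record_names`, the stack names and the hosted zone `example.org.` are hypothetical and are not taken from etcd.py; only the three format strings come from the snippet.

```python
def route53_record_names(stack_version, hosted_zone):
    """Build the three record names from the snippet above.

    Note that the stack name is not an input at all.
    """
    return [
        '_etcd-server._tcp.{}.{}'.format(stack_version, hosted_zone),
        '_etcd._tcp.{}.{}'.format(stack_version, hosted_zone),
        'etcd-server.{}.{}'.format(stack_version, hosted_zone),
    ]


# Hypothetical stacks: different stack names, same version, same hosted zone.
hosted_zone = 'example.org.'
stacks = [('dev-etcd', '1'), ('prod-etcd', '1')]

names_by_stack = {
    stack_name: route53_record_names(stack_version, hosted_zone)
    for stack_name, stack_version in stacks
}

# Both stacks resolve to exactly the same record names, so the second
# deployment would write to the records of the first one.
collision = names_by_stack['dev-etcd'] == names_by_stack['prod-etcd']
print(collision)
```

With this scheme, the only way to keep two stacks apart is to give them different versions.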

CyberDem0n commented 8 years ago

Hi! Etcd is not a stateless application, and you can't really run multiple versions of it in parallel. So we decided to use stack_version as the cluster name. It means you just need to deploy it multiple times, specifying different stack_versions, e.g. dev, staging, prod.

For each separate stack version, it will create its own records in Route53.

regispl commented 8 years ago

Thanks for the quick response! :-)

> Etcd is not a stateless application, and you can't really run multiple versions of it in parallel.

I'm not sure I follow why being stateful makes a difference here. Why can't I run it in the same way I can run multiple ZooKeepers (Exhibitors) in parallel? E.g. one with stack name "dev-exhibitor" and one with "prod-exhibitor", versioned independently?

In other words - what could go wrong if Route53 entries were created e.g. like this:

    '_etcd-server._tcp.{}.{}'.format(self.manager.me.cluster_token, self.hosted_zone)

rather than this:

    '_etcd-server._tcp.{}.{}'.format(stack_version, self.hosted_zone)

so that the entry looks like this: _etcd-server._tcp.dev-etcd-1.hosted.zone
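A minimal sketch comparing the two naming schemes. It assumes the cluster token has the form `<stack name>-<stack version>`, which is only inferred from the example entry `dev-etcd-1`. The helpers `make_cluster_token`, `record_name_by_version` and `record_name_by_token` are hypothetical and do not exist in etcd.py.

```python
def make_cluster_token(stack_name, stack_version):
    # ASSUMPTION: token is "<stack name>-<stack version>", e.g. "dev-etcd-1".
    return '{}-{}'.format(stack_name, stack_version)


def record_name_by_version(stack_version, hosted_zone):
    # Current scheme: only the stack version distinguishes clusters.
    return '_etcd-server._tcp.{}.{}'.format(stack_version, hosted_zone)


def record_name_by_token(stack_name, stack_version, hosted_zone):
    # Proposed scheme: stack name and version both end up in the record name.
    token = make_cluster_token(stack_name, stack_version)
    return '_etcd-server._tcp.{}.{}'.format(token, hosted_zone)


hosted_zone = 'hosted.zone'

# Two hypothetical stacks that share version "1".
dev_by_version = record_name_by_version('1', hosted_zone)
prod_by_version = record_name_by_version('1', hosted_zone)

dev_by_token = record_name_by_token('dev-etcd', '1', hosted_zone)
prod_by_token = record_name_by_token('prod-etcd', '1', hosted_zone)

print(dev_by_version == prod_by_version)  # same name under the current scheme
print(dev_by_token)
print(prod_by_token)
```

Under the token-based scheme, the two stacks get distinct SRV record names even though they share a version.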

> It means you just need to deploy it multiple times, specifying different stack_versions, e.g. dev, staging, prod

Hmm... Is that a known "pattern" for STUPS appliances? I always thought that StackName is meant to separate independent deployments, while version is a "per stack" thing. Is my understanding wrong?

CyberDem0n commented 8 years ago

What I don't really understand is why you want to change the stack name...

And actually, I didn't even think that someone would try to do it for Exhibitor (ZooKeeper) either.

Both appliances were designed in such a way that you can deploy more than one of them simply by changing the stack version, without any modification of the senza yaml.

(Sent by email in reply to the comment by "Michał M." from Jan 27, 2016 17:30.)

regispl commented 8 years ago

> why do you want to change the stack name

Because that's my understanding of how Senza is meant to work. I'm happy to validate my understanding, but the docs are not clear enough for me to figure out what the STUPS best practices are. Your comment is the first useful info on that I've got so far, so thanks for that :-)

To clarify: to me, "stack name" is in some way the equivalent of an "environment" (dev / prod deployments), while version is a way to guarantee immutable deployments (which - I see your point now - doesn't make that much sense for stateful services, but I still considered it a pattern that originated with stateless services, with stateful ones being just an exception). However, the fact that I can migrate traffic between different versions of a single stack is - for me - an indicator that version is meant to be used as I wrote above, because it only makes sense in that case: migrating traffic from version 1 of a service to version 2 makes sense, but migrating traffic from prod etcd to dev etcd will likely be considered a disaster, so that's what's confusing me here :-)

Anyway, I see your point now, and except for the traffic migration case it makes sense to me. I'll check the AWS docs on CloudFormation and investigate what others do :-) Thanks for the answer!

CyberDem0n commented 8 years ago

> To clarify: to me "stack name" is in some way the equivalent of an "environment" (dev / prod deployments), while version is a way to guarantee immutable deployments

For stateful applications it does not work at all. When you deploy, let's say, a new version (2.2.4) of etcd in parallel to the old one (2.2.3), you will simply have two parallel clusters running. The old one will contain some data, but the new one will be absolutely empty. Maybe for your application this is not critical, but usually when an application persists some data, it expects to get it back :)
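A toy model of that situation, assuming nothing about etcd itself: `ToyCluster` and `deploy` are hypothetical stand-ins, and the key and value are made up. The sketch only shows that a parallel deployment starts with an empty store of its own.

```python
class ToyCluster:
    """Stand-in for an etcd cluster: a plain in-memory key-value store."""

    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        # Returns None when the key was never written to this cluster.
        return self.data.get(key)


clusters = {}


def deploy(stack_version):
    # Each deployment creates a fresh, empty cluster. Nothing copies the
    # data over from a cluster deployed under another version.
    clusters[stack_version] = ToyCluster()
    return clusters[stack_version]


old = deploy('2.2.3')
old.put('/config/db-host', 'db.internal')

new = deploy('2.2.4')

value_in_old = old.get('/config/db-host')
value_in_new = new.get('/config/db-host')
print(value_in_old)
print(value_in_new)
```

An application pointed at the new cluster would not find the data it wrote to the old one.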

> but migrating traffic from prod etcd to dev etcd will likely be considered a disaster, so that's what's confusing me here :-)

It's not really possible to switch traffic from prod etcd to dev etcd by means of AWS, because neither a load balancer nor a weighted DNS record exists.

regispl commented 8 years ago

> It's not really possible to switch traffic from prod etcd to dev etcd by means of AWS, because neither a load balancer nor a weighted DNS record exists.

Cool, I missed that fact. I probably made an assumption based on what other appliances (the stateless ones) do, but in fact none of the stateful ones that I can check in our AWS account allows it, from what I can see now (they just fail with an ugly stacktrace for senza traffic).

Thanks again for the clarification! :+1: