fabiolb / fabio

Consul Load-Balancing made simple
https://fabiolb.net
MIT License

Q: Best practices deploying fabio / Nomad on AWS? #239

Open alexandruast opened 7 years ago

alexandruast commented 7 years ago

current options:

Too many options; I'm really undecided.

magiconair commented 7 years ago

TBH, that's a good question.

What we are doing

We have separate consul server clusters for the masters and then distinguish between frontend, backend and infrastructure nodes (e.g. redis, kafka, ...). All frontend nodes have fabio installed, and the physical load balancer in front of them is configured to route to them. Note that this is a dedicated physical setup.

When rolling out setups with Terraform and OpenStack, I have created nodes in three ways: from a golden image with consul/nomad/fabio preinstalled, from a blank image with packages installed based on the hostname (consul00x/nomad00x/fabio00x), or from a Salt-configured setup. That's where I left it, and it is actually one of the areas we are going to investigate in the coming weeks.

Opinion

I consider fabio to be infrastructure that needs to be there (like your databases, queues, caches, ...) and doesn't need to scale that dynamically. We usually run it and literally forget that it is there.

So you could just run a couple of fabio nodes and be done with it. Or you run fabio on every node, or on every frontend node. Running three extra nodes may be a big deal or not. Ultimately, it doesn't matter.

Docker

As for Docker or not: again that depends on your setup. My personal opinion is that if you are using Go static binaries you don't need Docker and that's the direction I've pushed my team in. I consider this the better long-term investment. However, if you're deploying everything with Docker I'd use fabio inside Docker for consistency reasons unless you have a performance issue.

Recommendation

Given the choices, I'd start by installing consul and nomad on every node and deploying fabio through nomad via the exec driver alongside the other services. Just treat it as another app. I assume you can configure the AWS LB to route inbound traffic to the fabio instances. Then analyze and see whether that's working. fabio doesn't consume much CPU and you can easily scale up, down or out. Keep it simple.
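To make "treat it as another app" concrete, here is a minimal sketch in Python that builds a job payload in the JSON form accepted by Nomad's HTTP API. The binary path, datacenter name, instance count and resource figures are assumptions for illustration, not values from this thread, and port reservations are left out for brevity.

```python
import json


def fabio_job(datacenters, binary="/usr/local/bin/fabio", count=3):
    """Build a Nomad JSON job payload that runs fabio via the exec driver.

    Assumes the fabio binary is already present on the node at `binary`
    (hypothetical path) and that fabio's default settings suit the node.
    """
    task = {
        "Name": "fabio",
        "Driver": "exec",                 # exec driver, no Docker involved
        "Config": {"command": binary},
        "Resources": {"CPU": 500, "MemoryMB": 128},  # placeholder sizing
    }
    group = {"Name": "fabio", "Count": count, "Tasks": [task]}
    return {
        "Job": {
            "ID": "fabio",
            "Name": "fabio",
            "Type": "service",
            "Datacenters": list(datacenters),
            "TaskGroups": [group],
        }
    }


payload = fabio_job(["dc1"])  # "dc1" is a placeholder datacenter name
print(json.dumps(payload, indent=2))
```

The printed JSON could then be submitted to Nomad like any other job; scaling out is a matter of changing `count`.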

P.S.: I usually ask myself: "What is the simplest thing that can possibly work?"

alexandruast commented 7 years ago

Thanks for the detailed answer.

Let me describe my setup and the issues I see with some of the implementations.

Packer - AMI builds for VPN (pritunl), consul, vault and nomad. There is one image for pritunl, one for the servers (nomad, consul and vault - I don't know if that's the right choice) and one for the nomad nodes.

Blank AWS account -> terraform global (dns, iam) -> terraform multiregion (everything else) -> terraform global (inter-region nomad discovery over the internet via security groups - again, I don't know if that's right). At this point, I just log into the VPN and fire a job on nomad (the mongodb database for the VPN is external, so it just works without any manual intervention).

The CI environment (jenkins, nexus) is deployed as nomad jobs (again, I don't know if that's right).

The ways I tested fabio:

Inside nomad as a Docker container - it registered multiple checks in consul, maybe because I set consul.service.consul as the consul server address. I had to remove the checks it registered from consul manually.

Inside nomad as exec - worked fine and gave more control, but running nomad as root (required for the exec driver) is not very attractive, and registering all the nomad nodes in an Amazon ALB/ELB seems counter-intuitive.

Standalone - worked fine, although the instance type required to actually get a decent network performance is overkill.
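For the leftover checks from the Docker test above, a minimal sketch of removing one through Consul's agent HTTP API rather than by hand. The check ID is hypothetical, and the agent address assumes a local agent on the default port. As far as I know, agent endpoints act on the agent that receives the request, so the call has to go to the agent where the check was registered; older Consul versions may expect a different HTTP method.

```python
import urllib.parse
import urllib.request


def deregister_check_request(check_id, agent="http://127.0.0.1:8500"):
    """Build (but do not send) a request that deregisters one check."""
    # Keep ":" readable; anything else unsafe in a path segment is escaped.
    path = "/v1/agent/check/deregister/" + urllib.parse.quote(check_id, safe=":")
    return urllib.request.Request(agent + path, method="PUT")


req = deregister_check_request("service:fabio-9998")  # hypothetical check ID
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req)  # uncomment to actually send it to the agent
```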

I am thinking of creating a special type of nomad node with the exec driver and running fabio along with other exec and docker tasks on these special nodes (although there is no exec task at this time other than nomad). Registering 3 or 5 instances in the ELB rather than all of them (200+ in prod) looks better.
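One way to express the "special nodes" idea is a Nomad constraint on the node class, combined with a system job so that each matching node gets one fabio instance. The sketch below only edits a JSON job payload; the class name "edge" and the minimal base job are assumptions for illustration, and the nodes themselves would need that class set in their client configuration.

```python
import copy


def pin_to_node_class(payload, node_class):
    """Return a copy of a Nomad JSON job restricted to one node class."""
    pinned = copy.deepcopy(payload)       # leave the caller's payload untouched
    job = pinned["Job"]
    job["Type"] = "system"                # one instance per eligible node
    job.setdefault("Constraints", []).append(
        {"LTarget": "${node.class}", "RTarget": node_class, "Operand": "="}
    )
    return pinned


base = {"Job": {"ID": "fabio", "Type": "service", "TaskGroups": []}}
edge = pin_to_node_class(base, "edge")    # "edge" is a hypothetical class name
print(edge["Job"]["Constraints"])
```

Only the nodes in that class would then need to be registered in the ELB.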

Another option I am considering is to use latency-based DNS in Route 53 and hook fabio into it directly with a Lambda function (although I don't know if fabio can even work like that at this time).

magiconair commented 7 years ago

I have to think about this a bit more and would encourage/ask the community to chip in with their experiences and setups. If you consider fabio infrastructure and run it as an app on the node itself you could alleviate your root concerns with the #195 patch. Since fabio is stateless there is no need to deploy it along with your app for the same reason you don't deploy nomad every time.

kaspergrubbe commented 5 years ago

Thanks for your question, and for your write-up @alexandruast, it would be interesting to know what you ended up doing.

It would also be interesting to know what other people do.