deis / controller

Deis Workflow Controller (API)
https://deis.com
MIT License

add deis domains:transfer support #1218

Open deis-admin opened 7 years ago

deis-admin commented 7 years ago

From @olalonde on August 9, 2015 17:50

I'm trying to do blue-green deployment with Deis. So I have two Deis apps (app-green and app-blue). At any time either app-green OR app-blue should be live (e.g. serve traffic for app.domain.com). Everything good so far except for one thing: is there any way to atomically change app-green and app-blue's DNS configs so that I can redirect traffic to one app or the other for my domain name atomically?

e.g:

deis domains:remove app.domain.com --app app-green && deis domains:add app.domain.com --app app-blue

Any way to do this atomically so there is no downtime? Other suggestions?

Copied from original issue: deis/deis#4237

deis-admin commented 7 years ago

From @bacongobbler on August 9, 2015 17:59

Hmm... It might be easier to cut it over with a domain record outside of Deis. Try this: start with a CNAME record for app.domain.com pointing to app-green.domain.com. Then, change the CNAME to app-blue.domain.com when it is time to cut over. That might be the easiest way to make sure there's no downtime.

deis-admin commented 7 years ago

From @olalonde on August 9, 2015 18:05

Oh right... should have thought about this -_-

deis-admin commented 7 years ago

From @nathansamson on August 9, 2015 18:11

How would that work? As far as I understand, this is on the same Deis cluster, so the routers would use the hostname to route to the correct application. In fact, I have never set up CNAMEs pointing to app-name.deishost.example, only to deis.deishost.example.

To answer the original question: do you actually observe downtime? My guess is that by the time the routers refresh their config, the new domain should already be there (the TTL is around 10s, AFAIK), so unless you are unlucky enough to hit that small refresh window you should be fine.

deis-admin commented 7 years ago

From @olalonde on August 9, 2015 21:30

Oh yeah @nathansamson is correct, I can't just change the CNAME.

> To answer the original question do you actually observe downtime?

I haven't tried yet. Oh I see what you mean... if I change the domains quickly with deis domains:..., the router might only restart once right?

deis-admin commented 7 years ago

From @nathansamson on August 9, 2015 22:15

That is what I expect...

Obviously there is still a small window where things can fail, and this window can be bigger if your client has a slow or unreliable internet connection (even worse if the command fails and you need to rerun it manually).

I guess a truly atomic replace of domains in the controller isn't a bad idea, but I think the current situation is probably good enough.

What is your actual reason for doing cutovers like this? I can understand it if they are different Deis clusters (e.g. an upgrade of your cluster), but for application restarts I don't really see the need.

deis-admin commented 7 years ago

From @krancour on August 10, 2015 14:16

@olalonde Deis already does blue/green in a manner of speaking. During any release (deployment or config change), the controller waits for the new containers to be alive before re-routing incoming traffic to them and killing the old ones. With the 1.9.0 release, @bacongobbler improved upon that substantially by allowing you to insert your own custom application health check into that process.

The above might be good enough for most scenarios. Where it breaks down is if you have more extensive verification testing that you want to do post-deploy, before flipping the switch.

@nathansamson is correct that the CNAME magic won't work. If you request myapp.example.com, it doesn't matter if that's a CNAME for myapp-blue.my-deis.example.com or myapp-green.my-deis.example.com. In either scenario, when the request hits a router, the host header will still say myapp.example.com, and that router will send it wherever the configuration says to at the moment. Whether it routes to blue or green will depend entirely on which of those is configured at that time to answer requests for myapp.example.com.
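The point about the host header can be illustrated with a toy model. This is a minimal sketch, not Deis router code: `dns_cname`, `router_config`, `route`, and `request` are hypothetical names, and the routing table is assumed to be a plain mapping from host header to app.

```python
# Hypothetical, simplified model of Host-header routing; not Deis router code.

# DNS: the custom domain is a CNAME for one of the per-app hostnames.
dns_cname = {"myapp.example.com": "myapp-blue.my-deis.example.com"}

# Router configuration: which app answers requests for which Host header.
# Here the custom domain is currently attached to the green app.
router_config = {
    "myapp.example.com": "myapp-green",
    "myapp-blue.my-deis.example.com": "myapp-blue",
    "myapp-green.my-deis.example.com": "myapp-green",
}


def route(host_header, config):
    """Return the app that serves a request, based only on its Host header."""
    return config.get(host_header)


def request(url_host, dns, config):
    """Resolve DNS to find the cluster, then route on the original host name."""
    # The CNAME target only decides which IP the client connects to.
    # The client still sends the name it asked for as the Host header.
    _cname_target = dns.get(url_host, url_host)
    return route(url_host, config)


print(request("myapp.example.com", dns_cname, router_config))
```

In this model, changing the CNAME target has no effect on the result; only changing the entry for `myapp.example.com` in `router_config` moves traffic.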

You can quickly alter the configuration as @nathansamson has suggested. To dispel one bit of confusion, however, that does not require the routers to restart. Routers are dynamically reconfigured periodically by a service called confd, without any downtime.

The thing that will dictate whether you have any downtime or not is how fast you can update the configuration. There could be a very narrow window where you've removed the domain from "blue," for instance, but haven't added it to "green" yet. This being the case, there might be a strong case for a feature request here. As a workaround in the meantime, you could probably minimize that window by scripting those config changes.
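One way to script the workaround is sketched below. It only wraps the two `deis domains:remove` and `deis domains:add` commands from the original question so they run back to back. The function names and the rollback step are assumptions added for illustration, and the window is reduced, not eliminated.

```python
import subprocess


def build_cutover_commands(domain, old_app, new_app):
    """Return the two CLI invocations from the original question, in order."""
    return [
        ["deis", "domains:remove", domain, "--app", old_app],
        ["deis", "domains:add", domain, "--app", new_app],
    ]


def cut_over(domain, old_app, new_app, run=subprocess.check_call):
    """Run remove then add back to back to keep the unrouted window short.

    `run` is injectable so the sequence can be exercised without a cluster.
    """
    remove_cmd, add_cmd = build_cutover_commands(domain, old_app, new_app)
    run(remove_cmd)
    try:
        run(add_cmd)
    except subprocess.CalledProcessError:
        # Assumed best-effort rollback: re-attach the domain to the old app
        # so it is not left unrouted, then surface the original failure.
        run(["deis", "domains:add", domain, "--app", old_app])
        raise


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    cut_over("app.domain.com", "app-green", "app-blue", run=print)
```

Running it from a machine close to the controller, rather than over a slow connection, should shrink the gap between the two calls further.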

deis-admin commented 7 years ago

From @olalonde on August 11, 2015 6:54

@krancour thanks for the clarifications. @nathansamson the reason I'm doing this style of deployment is that once in a while, I need to completely wipe out my app's database and rebuild it from scratch, but this process takes a few hours (syncing the Bitcoin blockchain). In the meantime, I want the old version of the app/db to keep running. I suppose I could just rebuild the database and use deis config to tell my running app to start using the new database when it's ready. But if the database structure changed in a significant way, my old app code will fail. There are other ways to get around the problem (e.g. do proper database migrations) but it would involve more work for me and I'd rather go with a simple process. I could also have two clusters and do the switchover at the DNS level, but I can't afford another cluster.

I can accept some downtime but it would be nice to have this feature :) If we can agree on a deis command syntax or process to do this, I could attempt to do it though I don't guarantee I will figure it out all by myself. Maybe one solution would be to have deis domains:add some.domain.com --force which would remove some.domain.com from other apps that use it (and make sure nginx doesn't reload its config in the middle of the process). Related CoreOS issue: https://github.com/coreos/etcd/issues/860

deis-admin commented 7 years ago

From @krancour on August 11, 2015 16:55

> Maybe one solution would be to have deis domains:add some.domain.com --force which would remove some.domain.com from other apps that use it

That's a useful thought. Ping @Joshua-Anderson. Thoughts?

deis-admin commented 7 years ago

From @Joshua-Anderson on August 11, 2015 18:03

:+1: It wouldn't be too difficult to do, but I probably don't have time this week, as I have a lot I need to finish. I may have time to tackle this in the next month or so.

deis-admin commented 7 years ago

From @bacongobbler on August 12, 2015 21:11

To add my thoughts to this, I would like a native command rather than a flag. Something like deis domains:update mycustomdomain.com (with --app as a flag for non-interactive shells) would allow a user to move a domain from one app to another. Of course, the user must have permission on both applications, but essentially that would allow us to update the db records without any deletions, which in turn will ensure that the router re-templates the domain route without any downtime. That way we can reuse this command for more than just app transfers in the future.

Does that sound like something you'd want to tackle, @olalonde?

deis-admin commented 7 years ago

From @olalonde on August 13, 2015 18:35

@bacongobbler Yep, I'd be happy to try.

deis-admin commented 7 years ago

From @krancour on August 13, 2015 20:44

@olalonde first, thanks for the willingness to take a crack at it. Before you get going, it's worth mentioning that our intention is to shortly transition off of the Python-based deis CLI and onto a new one written in Go. The latter is functionally equivalent (save for a few minor bugs) and is being put through its paces now. I'm mentioning this so you'll know to apply your efforts toward the Go-based client and not the old Python-based one.

deis-admin commented 7 years ago

From @olalonde on August 15, 2015 12:57

@krancour thanks for pointing that out.

So far, I managed to install everything locally (local cluster seems to be working) and looked around the code. If I understand correctly, I would need to add a method around here (https://github.com/deis/deis/blob/v1.9.0/controller/api/models.py#L1002) that only gets executed when a POST parameter is present (e.g. force or update) which would remove the old domain before carrying on with adding the new one (current behaviour is to refuse adding the domain if it is not unique). I suppose I would also need to make sure that the user has permission to the app that currently owns the domain. I think the etcd code _etcd_publish_domains will work as is and will just overwrite the previous app name with the new one (https://github.com/deis/deis/blob/v1.9.0/controller/api/models.py#L1190)
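The logic described above could look roughly like the following. This is a minimal sketch in plain Python 3, not the controller's Django models: `DomainRegistry`, `grant`, `add`, and the `force` argument are hypothetical names. It also reassigns the domain in a single step rather than removing it first, which is closer to the "update without any deletions" idea raised earlier in the thread.

```python
# Hypothetical sketch in plain Python 3; the real controller uses Django models.


class DomainRegistry:
    """Toy stand-in for the domain table: maps each domain to one app."""

    def __init__(self):
        self._owner = {}        # domain -> app name
        self._permissions = {}  # user -> set of app names the user may manage

    def grant(self, user, app):
        self._permissions.setdefault(user, set()).add(app)

    def _can_use(self, user, app):
        return app in self._permissions.get(user, set())

    def owner(self, domain):
        return self._owner.get(domain)

    def add(self, user, domain, app, force=False):
        """Attach domain to app; with force=True, take it over from another app."""
        if not self._can_use(user, app):
            raise PermissionError("no permission on target app")
        current = self._owner.get(domain)
        if current is not None and current != app:
            if not force:
                # Matches the current behaviour: refuse non-unique domains.
                raise ValueError("domain already in use")
            if not self._can_use(user, current):
                raise PermissionError("no permission on app that owns the domain")
        # Single assignment: the domain is never left without an owner.
        self._owner[domain] = app


registry = DomainRegistry()
registry.grant("alice", "app-green")
registry.grant("alice", "app-blue")
registry.add("alice", "app.domain.com", "app-green")
registry.add("alice", "app.domain.com", "app-blue", force=True)
print(registry.owner("app.domain.com"))
```

In a database-backed version, the equivalent of the single assignment would be one update of the existing record instead of a delete followed by an insert.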

I haven't been able to run controller tests (no Python/django experience here so I'm probably doing something wrong):

$ make test
venv/bin/pip install --disable-pip-version-check -q -r requirements.txt -r dev_requirements.txt
/Users/olalonde/code/go/src/github.com/deis/deis/controller/venv/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/67/wj749ptn3jlb4_nlf9nphyy00000gn/T/pip-build-QgXWTM/psycopg2
make: *** [setup-venv] Error 1
$ python --version
Python 2.7.6
$ pip --version
pip 7.1.0 from /Library/Python/2.7/site-packages/pip-7.1.0-py2.7.egg (python 2.7)

deis-admin commented 7 years ago

From @Joshua-Anderson on August 15, 2015 14:38

@olalonde That seems to imply you don't have an SSL library installed. Have you installed openssl where Python can find it (via homebrew, etc.)?

deis-admin commented 7 years ago

From @olalonde on August 15, 2015 14:47

brew install postgresql followed by make setup-venv fixed that problem. Also had to brew install shellcheck and run make postgres.

Cryptophobia commented 6 years ago

This issue was moved to teamhephy/controller#40