docker opinion. - Githubissues

happysalada commented 4 years ago

First of all, thanks for the great post!

It was interesting to me to see that you don't use docker at all. I see docker everywhere, but it's a choice I feel a little ashamed to say I've never really questioned. Was it a conscious choice for you to not use docker (making an image for each service and deploying the container)? I've heard there might be a performance cost to using docker, but I've never seen anything that points to the cost of using it. For me the benefits in this regard are the ease of use of updating software for example. The other one being deployment, you just build a new image and 10s later it's deployed. I'd be curious to know your opinion on the use of docker.

Thanks again for the very informative cover!

fpereiro commented 4 years ago

Hi @happysalada !

I used Docker quite extensively a few years ago and my experience was mostly positive. However, after some time I considered whether I could do without it - since I'm always trying to simplify things - and I found out that it was definitely possible.

The advantage I had using Docker was the possibility to build an environment from scratch using the same rules every time. This gives you a lot of reliability and makes the setup trivial for others too. For the types of systems I usually build, I can however get this without using Docker. This is because:

I usually deploy one app per instance/server, so I don't have overlaps between the programs needed by app A and those needed by app B. And when I deploy multiple apps to the same instance/server, they usually use the same versions of node and redis, so there's no problem there either. In other words, app A and app B can be oblivious to each other in terms of provisioning.
I fully control the instances/servers I use, so there's no risk of someone logging in and installing things which may inadvertently break the programs on which the apps rely.

If any of these conditions were absent, I would seriously consider using Docker containers to keep an isolated and replicable context for each app. As it stands, however, I can do the same just by using single instances.

I don't like the concept of a Docker image as the "replicable environment" that you ship - when I used to work with Docker images, I'd make sure that they would be equivalent to what you'd get by running a Dockerfile from scratch. I'd much rather have a short set of instructions that can create the environment from scratch - in other words, I want a precise recipe that takes half a page, instead of a magic tap where the substance comes out, but which I don't understand fully. Hope the analogy makes sense.

As for speed of deployment, I find that by deploying directly with the OS through ssh takes ~15 seconds, so I don't see any time-saving potential there by using Docker.

Using Docker brings another set of elements to the picture, like images, dockerfiles, port mapping, special commands (even if they're a few), so the overall surface of the architecture becomes bigger, in my opinion.

There might be other legitimate reasons for using Docker that I'm missing, and it also may boil down to personal taste. I just haven't found it necessary lately.

Feel free to reopen the issue if you have further questions. And thank you for bringing this up!

happysalada commented 4 years ago

Thank you for your great answer!

One of the advantages that docker provides that I find hard to replicate is when you deploy. With your current setup, when you want to deploy a new version, do you stop the old version, then start the new one? I don't think having a little bit of downtime everytime you deploy is that much of a bad thing, for most apps, it's not a problem. Docker gives you the ability to start a new container, wait until it's ready and then swap the new and the old one (there can be problems there of who should handle what requests how, but in practice they are very few).

What if a deployment goes wrong, can you roll back easily? I only have had to roll back a couple of times, so potentially not that needed as well.

last regarding performance. Do you have a private link running between your machines? (if running over the http network, you might experience latencies) Running two containers in the same docker instance (redis and node for example), you have almost no latency between the two. (I have never tried this with redis, but with postgres, so I imagine it's the same)

The one thing that docker enables though, is the use of kubernetes. Kubernetes is complex, that's for sure, but it can take care of the scheduling of your containers for you. Making sure you don't have a machine almost not used and another one maxed out. I guess here, you can always add new machines as needed, and so it might only be useful for scaling.

last unrelated question, do you use VMs or baremetal servers for your machines? (out of curiosity)

fpereiro commented 4 years ago

Hi @happysalada! Great questions ;).

Here's an outline of the deploy process I do with bash: https://github.com/fpereiro/backendlore#deploying-the-server . Once the code & dependencies are updated, mongroup (the program that runs node) restarts the node process. Downtime is < 1s, the node process (and associated children, if we're using cluster) exit cleanly and then the fresh node is ready for action. I'm actually quite happy with this system and has worked well for me in the past. If I wanted absolutely no downtime, I would do it by stopping the redirection of traffic from the LB to that api server in the first place and performing the deploy once that api is done processing all its pending requests - but so far I didn't have the need...
Rolling back would consist of checking out the desired (previous) commit and deploying again. Of course, if your new code broke the database (yipes), it will take more than this, but in that case you'd also need a database restore and that goes beyond the scope of what we're discussing. If the deployment goes wrong not because the code is wrong but something else is failing on the server (permissions, disk space), then the issue is a major provisioning/deployment issue; after the fire is put out, it is essential to find out the root cause(s) so that should never happen again.
Regarding performance, I really like the idea of building an infrastructure of machines that can communicate through the open network. While this adds security, performance and consistency challenges, I believe those challenges make the architecture better. At the beginning of the project you can use just a single machine where everything is hosted, except perhaps for the DB replica, so no need to use private links anyway. But when the project scales and must be resilient, it would be necessary to move to a multi-region setup and private links then might not be available (I might be wrong, but I don't think private IPs work between AWS regions, for example). So, to keep things architecturally simple, I try to keep all the servers talking through the open network. It does help a lot if the server/instance provider is well connected to the main internet exchanges. AWS, DigitalOcean and Hetzner (to name a few) all do this well.
I don't use Kubernetes and try to keep things run semi-manually, at the (small) scales I'm operating. Instance load is usually distributed well thanks to a LB (or even DNS records), so it's uncommon to see a scenario where one instance is being used much more than the other. When I start managing a larger architecture that needs to grow and shrink semi-automatically, I might reconsider this stance. I do feel, however, that I'd anyway rather have a solution that is smaller and lighter conceptually than Kubernetes, and which I can understand in terms of procedural code, but I'm getting ahead of myself. Better to talk about this when I actually encounter the problem :). With that said, if you like Kubernetes and you feel it solves your core problems, it's probably the right solution for you!
I've always used AWS or DigitalOcean instances but I really want to try Hetzner's bare metal servers. I'll definitely share my experience when I get to that point - and also the code, since the projects I'll run on that architecture are both open source services. I'm excited to try this stuff out.

Feel free to ask further questions; I hope to answer a bit faster next time. Cheers!

happysalada commented 4 years ago

More information on Hetzner baremetal if you're interested. (my 2 cents) I'm using the 8 core 64 GB at 69 Euro/month). It's very powerfull! Running my db and web server on the same machine gives me no latency at all on db requests. The 8 cpus (dedicated cpus, which are a world appart from shared ones), are enough for just about everything. (you also get a ton of disk space (960 Gb in my case), so that if you run your db on the machine, you can keep using that machine for a while). If I had to do it again, I would probably choose their 39 euro baremetal machine (only 6 cores, 64Gb, and 480 gb of nvme ssd). It's already very powerful.

The one problem I experienced in that in one year of use, you get server that crash when they start getting old (happened to me once in the last year). You just need to ask them to change it, and you are good for another year. When the server crashes though, you need to do a physical restart (can be done from the web console). If you are not around when this happens, this can obviously lead to quite a lot of downtime. So while baremetal performance is absolutely awesome, if you don't have another layer to control for availability, you are expose to the once a year (so far) hard crashing of server. So if using baremetal, you kind of have to have some redundancy.

Thanks a lot for all the answers, very informative!

fpereiro commented 4 years ago

Interesting stuff!

My reason for choosing Hetzner (in the future) is the sheer amount of size and power you get. It effectively allows me to compete in price (somewhat) with big players. Glad to hear you're happy with them!

I also definitely have that concern about server crashes; I'm curious whether that crash is at the OS level (say, Ubuntu gets stuck somehow) or it reflects an underlying issue between the software and the hardware. My plan is to programatically restart API servers every week and DB/FS servers every 2-3 months manually, through planned downtime, to keep them fresh. I wonder whether that may reduce the chances of encountering a scenario like the one you described above (and which is very valuable, because it actually happened to you!). In any case, it'd probably be necessary to have the DB/FS servers with a master-master structure, especially if running them on baremetal.

Good stuff!

happysalada commented 4 years ago

I don't think it's at the OS level.

I'm running ubuntu 18.04 with just a docker installation on it
when I ask them to physically change the server (but keep the disks), the crashes stop.

When the server crashed, I couldn't even ssh into it. Sending a "reboot" signal from the web interface didn't work either. It had to be a "physical" restart (whatever that means).

Curious to know how your baremetal experiments work out!

I'm currently going to stay on one machine, and tolerate the once a year downtime until things get much bigger.

Your article got me curious as to experiment without docker at all. I'm curious especially performance-wise. I feel once you learn docker, there is no more complexity to it. I'll try a no-docker setup in the near future, thanks for giving me the idea!

fpereiro commented 4 years ago

It must be a hardware thing then, for sure. I've never experienced anything of this sort running AWS instances or DigitalOcean droplets. Thanks for confirming my fears on this :). I'll still go ahead with baremetal, but use a much more conservative approach. I'll be sure to share my experience.

Performance wise, I'm not sure there's much to gain by skipping Docker - at least when I used it a few years ago, the performance difference was negligible, and we were able to run a system with a lot of traffic (and performing a lot of disk reads and CPU intensive operations) without problem. I would expect that Docker has gotten better and better performance wise during all these years. But it's worth to try it!

Thanks for sharing your experience with Hetzner baremetal, it is very welcome.

Olshansk commented 4 years ago

Just wanted to chime in an say that I found this discussion very interesting and informative! I'm really glad this type of thing is discussed in a public forum :)

I don't really have much to add aside from the fact that I also like using Docker for most of my projects, but have never experimented with Hezner baremetal. The discussion in this thread is making me think that I'll definitely go down that approach next time I deploy something new.

fpereiro commented 4 years ago

@Olshansk if you try out Hetzner baremetal, I'd love to hear about your experience either here or privately. Best of luck with your upcoming projects!

fpereiro / backendlore

docker opinion. #11