JonasProgrammer / docker-machine-driver-hetzner

Docker machine driver for the new hetzner cloud API
https://jonasprogrammer.github.io/docker-machine-driver-hetzner/
MIT License
426 stars, 53 forks

Support the creation of servers with only private networks #84

Closed eedugon closed 1 year ago

eedugon commented 2 years ago

Hetzner recently added support for servers that belong only to private networks, without any public interface.

This is a great achievement for security and architectural purposes, and it would be great if this driver for Rancher supported the creation of the servers in this new way.

To create a server without public IPv4 and IPv6 addresses, Hetzner has added two new flags, documented here: https://docs.hetzner.cloud/#servers-create-a-server (see the public_net object).

With the CLI (hcloud) we only need to pass --without-ipv4 --without-ipv6 when creating the server. With the Go library used in this project, I assume we should just add the new public_net object, with the enable_ipv4 and enable_ipv6 booleans set to false, when creating the server.

Of course this should be used together with the existing option of the driver to "use private networks".
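For illustration, here is a minimal sketch of what such a request body could look like, built with plain encoding/json rather than the Go library the driver uses. The struct and function names (publicNet, createServerRequest, privateOnlyRequest) are made up for this example, the server name, type, image and network ID are placeholders, and only a handful of request fields are modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// publicNet mirrors the public_net object of the "create a server" request.
type publicNet struct {
	EnableIPv4 bool `json:"enable_ipv4"`
	EnableIPv6 bool `json:"enable_ipv6"`
}

// createServerRequest models only the fields relevant here (illustrative type,
// not part of any library); the real request accepts many more fields.
type createServerRequest struct {
	Name       string    `json:"name"`
	ServerType string    `json:"server_type"`
	Image      string    `json:"image"`
	Networks   []int64   `json:"networks"`
	PublicNet  publicNet `json:"public_net"`
}

// privateOnlyRequest builds a request body for a server that is attached to
// one private network and has both public address families disabled.
func privateOnlyRequest(name, serverType, image string, networkID int64) ([]byte, error) {
	return json.Marshal(createServerRequest{
		Name:       name,
		ServerType: serverType,
		Image:      image,
		Networks:   []int64{networkID},
		// Both flags are false on purpose: no public IPv4, no public IPv6.
		PublicNet: publicNet{EnableIPv4: false, EnableIPv6: false},
	})
}

func main() {
	// All argument values are placeholders.
	body, err := privateOnlyRequest("node-1", "cx11", "ubuntu-22.04", 12345)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The booleans are deliberately serialized without omitempty, so that false is sent explicitly instead of being dropped from the body.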

Hope you find this proposal interesting.

JonasProgrammer commented 2 years ago

Should work with 3.8.0 now, could you try it please?

eedugon commented 2 years ago

Thanks a lot for the quick PR! Looking forward to trying this out.

As a Rancher user, I use this driver together with the UI/JS package that takes care of the calls (https://github.com/mxschmitt/ui-driver-hetzner); I haven't used the driver directly myself.

Should I modify the other code (it belongs to a different GH repo) in order to try this out?

Or what's your suggestion to try this? The code and changes look good to me and I believe it will do exactly what I was suggesting.

eedugon commented 2 years ago

@JonasProgrammer : I have tested the 3.8.0 driver together with a customization made on https://github.com/mxschmitt/ui-driver-hetzner and it works fine. I'm able to create servers with only private networks. Thanks a lot, I think we can close this issue!

hoerup commented 2 years ago

@eedugon did you commit your changes to ui-driver-hetzner to a fork / PR somewhere ?

eedugon commented 2 years ago

@hoerup : nope, I couldn't open a PR because I was unable to build component.js properly from the source. I believe that repository (ui-driver-hetzner) is not really aligned with the published component.js for Rancher 2.x, so I found it impossible to build a valid component.js from that repo that resembles the published one, hence I couldn't contribute at all :(

Maybe it's because I'm missing something silly as I'm not a developer but a sysadmin.

I created 2 issues there to make note of that:

What I did was an ugly hack on the published component.js to make use of the driver's new option.

I explain that in a comment here.

If you want to try it out feel free to take my changes from https://gist.github.com/eedugon/66b8f7fce3d059faefe790bc5a7190be. Remember that you will have to host that file together with component.css and hetzner.svg in a web server.

martyrs commented 1 year ago

Just wanted to note that servers without public networks (IPv4/IPv6) cannot talk to the outside world. Maybe it makes sense to state this in the docs. Cheers.

eedugon commented 1 year ago

@martyrs : in order to have external connectivity while keeping only the internal interface, you would need to deploy your own NAT gateway / firewall in the network (until Hetzner offers that as a service; I don't know whether it's in their plans).

I hope to publish a how-to guide for this setup soon, because it's not difficult and works pretty well.

martyrs commented 1 year ago

@eedugon yeah, that should be pretty straightforward using WireGuard (WireGuard is available on Hetzner cloud images).

Wouter0100 commented 1 year ago

We have been using this feature quite intensively for the past couple of days. Unfortunately, it does not work reliably. This is not the fault of docker-machine-driver-hetzner, but of Hetzner.

I have seen 3 cases:

image

The first case is the one I see most often. In that case the docker-machine-driver-hetzner fully hangs waiting for the machine to get in a ready state. If my manual start succeeds, the docker-machine-driver-hetzner continues. I have seen 5-6 machines at the same time in this "error'ed" offline state.

Hetzner is aware of this problem and is not likely to fix it anytime soon. Even though they don't mention anywhere that the Private Networking feature is in beta/alpha or unstable, they do not seem to prioritize a fix for this (even though it is seemingly in production). I quote:

we don't have any solution for that currently. The only workaround is as mentioned to create the servers without private network and attach them later to it.

We know that this is a feature we offer and it should work as expected. We are working on it, however we can't give you any ETA.

Their solution is to start (or only create, not sure) an Instance w/o Private Network and attach it later. Should we perhaps:

JonasProgrammer commented 1 year ago

I can see how this is causing headaches, so I want to try to at least somewhat mitigate this. On the other hand, I would like to keep the impact limited, as I am not really keen to introduce too much complexity (read: things I can do wrong) to work around something (hopefully) temporary.

Add a timeout for machine creation. Then delete and retry?

Not really a fan of this, as it would cause headaches of all sorts. This would introduce a whole lot of complexity pretty much everywhere, and the scope is not really clear: Do we do this for servers only? If so, why? If not, should every single API call be retried/where to draw the line? Driver aside, this could be implemented on a higher level, however. Just have some watchdog process run alongside and then send a signal to docker-machine, if still running. Granted, the cleanup could be a bit messy due to possible race conditions but that is the case even now, if you interrupt the creation in the wrong phase.

Add an option to first create the machine w/o autostart, attach the network and then start it?

That I can get behind. It pretty much affects only the creation stage and should be fairly straight-forward (remove from CreateOpts, add additional call later). The question is, is the problem really just related to the attachment on creation? Or is it creation with rapid attachment following it? If the latter is the case, then we will need a delay in between. Perhaps that in itself should be an option?

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

perlun commented 1 year ago

@JonasProgrammer I guess this could be closed by now? No feedback in 2 months, and does the original problem even exist any more?

JonasProgrammer commented 1 year ago

I'm not entirely sure everything is fixed, given the fact that we only recently introduced a flag to wait on server creation so outside orchestration does not run in loops, but I guess the flags are there and the lack of response is an acceptable argument.