allow switching gateway modes without recreating/restarting the container

bfg100k commented 3 months ago

Current implementation requires recreating the container whenever we want to change the gateway mode. By moving the firewall logic to a standalone script, one can now switch gateway modes on the fly.

The proposed pull request does not change current behaviour (i.e. you can still configure the gateway mode during container creation via ZEROTIER_ONE_GATEWAY_MODE) but adds the ability to change the gateway mode while the container is still running, via the following command:

docker exec <containername> gatewaymode [status|inbound|outbound|both|none|disable]
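As a rough sketch of how the command above might be driven from a script, the following Python helper builds the `docker exec` argument list and checks the mode against the options listed above. The helper names are illustrative assumptions and not part of the pull request, and nothing is assumed about what the command prints.

```python
import subprocess

# Modes accepted by the gatewaymode command, as listed in the pull request.
GATEWAY_MODES = ("status", "inbound", "outbound", "both", "none", "disable")


def gatewaymode_argv(container, mode="status"):
    """Build the argv for `docker exec <container> gatewaymode <mode>`."""
    if mode not in GATEWAY_MODES:
        raise ValueError(f"unknown gateway mode: {mode!r}")
    return ["docker", "exec", container, "gatewaymode", mode]


def run_gatewaymode(container, mode="status"):
    """Run the command and return its raw output (format not assumed here)."""
    result = subprocess.run(
        gatewaymode_argv(container, mode),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.strip()
```

For example, `run_gatewaymode("zerotier", "both")` would switch a container named `zerotier` (a hypothetical name) to `both` without recreating it.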

For context, I normally have the container set to outbound for remote backups and both for when I travel (so I can remote in to provide sys admin support). In the past I had to manually recreate the container before and after each trip. Lately I decided to automate this using Home Assistant (triggered via presence detection). Although I could script the recreation of the container with the required gateway modes, it would break existing connections during the recreation process (not good for backups!). The new approach takes care of this by only amending the required rules for the desired gateway mode.

zyclonite commented 3 months ago

looks great, i am away from my computer for the next few days. if you are ready i can merge it to main and you can test the image before creating a release.

bfg100k commented 3 months ago

yep happy to proceed. @Paraphraser appreciate if you can give it a test as well since you are also an active user of this variant.

Paraphraser commented 3 months ago

I know I'm probably going to sound a bit reactionary here but I'd like to explain my thinking behind proposing #12 as a separate image ("-router") rather than trying to bolt the "router" functionality onto the existing "client" image.

It was because, when I went to DockerHub, I saw a very large download count and, to me, that meant altering the existing container and screwing something up carried the risk of affecting a lot of people. Going with a separate image avoided that risk.

DockerHub currently shows 500k pulls, albeit for all images and tags. There may be a way of figuring out how many pulls a given image has had but I don't know how to do that so I'll just go with the base number and say it's large enough for me to continue to fret about how the prospect of something going wrong has the potential to add misery to a lot of lives, and how I think that should inspire caution.

To be perfectly frank, I can't actually understand the use-case you've proposed. Back in #12 days, I couldn't see the point in anything other than what has become:

ZEROTIER_ONE_GATEWAY_MODE=both

but I preserved the inbound and outbound alternatives. I think they were in examples on the ZeroTier web site but my memory is a bit hazy on that.

For me, the notion of switching modes provokes a "but why on earth would anyone want to do that?" reaction. You've given the example of triggering this via presence detection but that hasn't caused any "ahah" moment either.

See, when I'm at home, I'm assuming the "both" is providing LAN-to-LAN connectivity with another "both" 1,200km away where I do remote support. When I'm on the road, I'm assuming I can fire up a client and get back home and still get to the other site. For me, ZeroTier-router is (and always has been) set-and-forget. The idea of changing modes at all let alone on-the-fly takes me right back to "why?"

More to the point, when I'm at home and I want to test my "remote" access, I want to be able to switch off WiFi on the phone to force a 4G connection (5G is a pipe dream where I am), then tether the laptop and be sure that what I'm seeing is what I would see when I'm on the road. The last thing I'd need would be "presence detection" second-guessing what I'm trying to do.

I don't see any security benefits from switching modes. And I mean zip, zero, zilch and none. At its core, it seems to me that you either trust the cryptographic key relationship between clients and ZeroTier Cloud, and the client authorisation process in ZeroTier Central, or you don't. If you don't then I reckon that's the same as concluding that ZeroTier is insecure and not fit for purpose. To put this another way, if switching modes is intended to be a way of enhancing security, I'm not seeing that either.

I also find myself wondering what happens if you've triggered a mode switch (which I presume is done via a yet-to-be-documented docker exec call) and then the container resets for some reason (eg an apt upgrade replaces some of the docker guts and resets all running containers). My understanding is it is going to take the value of ZEROTIER_ONE_GATEWAY_MODE from the compose file. Yes/no? I don't know about you but it would drive me nuts if I spent ages tracking through routing tables trying to figure out why something wasn't working, only to find presence detection thought I was at home when I wasn't, or vice versa.

That said, if I did see a need for this kind of functionality, I'd probably look at adding the mosquitto clients to the container, so the container could always determine the mode it should be in, and have the presence detection mechanism post a retained message declaring the correct state.

I don't make the decisions here and I don't feel any great sense of "ownership" in the original -router material. My opinion is just one voice. But I'd really rather see either:

a third variant producing a third image (eg -dynrouter) which people who see the need for that use-case can then pull if they want it; or
implement your ideas as a fork and push your own "router" image to DockerHub; or
absent some compelling evidence that switching modes dynamically is more than a niche requirement (ie you're right about the wider need and I'm wrong), you just do it for yourself as a local Dockerfile built on top of the existing -router image.

Paraphraser commented 3 months ago

> yep happy to proceed. @Paraphraser appreciate if you can give it a test as well since you are also an active user of this variant.

I'd rather get your response to my comments first.

For me, a full test involves taking a chance that the "other" host 1,200km away won't stuff up. If it does, I'll lose connectivity and need a non-tech-savvy person to either return the whole remote Pi by post or get a person whose fingers are not at all nimble to swap in a replacement SD that I would also have to send by post. So, in a practical sense, this is as close to a hard "no" as I usually get.

bfg100k commented 3 months ago

Always good to have another point of view, Phil. You have made the assumption that ZT is used to connect multiple remote nodes/networks in trusted environments (or those that you control), which then makes all modes other than "both" pointless. With this I agree. However, if the remote nodes are in locations/networks that you don't control, then perhaps it makes sense to add an additional layer of security.

I first explored ZT gateway mode for a project where we deploy edge devices in the "field" (aka customer premises) and needed a secure way to push updates, monitor and provide remote support to these devices. ZT was perfect for this as it's lightweight and can easily hole-punch through any network. The edge devices run the ZT client and we have a ZT gateway that connects our support/server network. Because the physical networks and locations where these devices reside are not within our control, we naturally assumed a defensive posture by blocking all incoming connections on the ZT gateway (so gateway mode outbound).

In my personal capacity, I use ZT in a similar fashion. My remote backup server is located at a friend's home and ZT makes it easy enough to just drop and go. I don't question or audit how he sets up and maintains his network but, just to be safe, I assume that the device could be compromised, hence it is best to block incoming traffic since there never should be any in the first place. And in case you are curious, my backup is encrypted at source and I have multiple copies (local + remote) so I can afford to lose one (or two).

For on-the-go access into my home network, I suppose I could set up another ZT network where all devices are trusted, but I don't really see the point of doubling the administration effort when I can easily flick the switch to change the gateway mode on this existing ZT network just for that duration.

The bottom line is that gateway mode is there not because I don't trust the implementation of ZT, but because one or more devices on the ZT network may be in potentially "hostile" locations and subject to attack/compromise.

Hope that makes sense?

Paraphraser commented 3 months ago

It makes sense and it's the kind of thing I might once have done myself in a past life with a multi-port router, upping and downing circuits for specific requirements - I'd likely also bolt firewall rules on top.

Candidly, although I now "get" your use case, I can't say that I'm completely persuaded this is necessarily the best solution, and I think many of the points I made earlier (eg unplanned reset causing mode switch, absence of doco, and possible alternative deployment approaches) stand.

That all said, my comments were only ever intended to be food for thought. Plus, I also think there's nothing worse than putting up a proposal and getting nothing but crickets in reply. At least you know someone is listening. 🤓

Ultimately, you're the person who did all the work so it's your call.

bfg100k commented 3 months ago

Not trying to convince you but really just for my own clarity -

1) fluffiness of presence detection trigger

HA can be configured to provide a reliable way to detect presence by leveraging multiple sensors (or trackers): detecting a mobile device on wifi is one, and GPS location via the mobile app is another. In my case, I only want the mode to switch when I'm not on local wifi, more than 100 km away from home AND have been away for more than 24 hours. As a failsafe, the mode change can also be triggered via a switch on the HA dashboard, which my non-techie partner (and kids) can do manually (vs getting her to ssh in, change a text file and run a command).

2) handling unplanned mode switch + working out what mode the "router" is in

The proposed change includes a command option to show what mode the gateway is in (this is also the default if the command is run without arguments). In HA, I have a binary sensor mapped to this command so I have a real-time view of the gateway mode (and docker status) at all times. As for an unplanned reset of the container causing an unexpected mode switch (good point here btw), this can be addressed by adding an automation that monitors for unintended mode switches.

3) deployment alternatives

I do maintain my own fork of this repo and run my own custom docker image as I like to tinker and experiment. This feature has been useful to me, hence I figured I might as well offer it upstream in case it benefits others. Having said that, let's keep it on hold until one or more others chime in to upvote this feature?
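To make points 1 and 2 concrete, here is a minimal Python sketch of the trigger rule and the mode check. It assumes "outbound" is the at-home mode and "both" the travelling mode, as in the use case described at the top of this thread. The function names are made up for illustration, and so is the idea that the current mode can be read back as a bare mode name, since the format of the status output is not described here.

```python
def is_travelling(on_local_wifi, distance_km, hours_away):
    """True only when all three 'away' conditions from point 1 hold."""
    return (not on_local_wifi) and distance_km > 100 and hours_away > 24


def desired_mode(on_local_wifi, distance_km, hours_away):
    """Pick the gateway mode: 'both' while travelling, else 'outbound'."""
    if is_travelling(on_local_wifi, distance_km, hours_away):
        return "both"
    return "outbound"


def reconcile(desired, read_mode, set_mode):
    """Re-apply the desired mode if the gateway reports something else.

    read_mode and set_mode are caller-supplied callables (hypothetical
    wrappers around the gatewaymode command). Returns True if a
    correction was made, False if the mode was already as desired.
    """
    if read_mode() == desired:
        return False
    set_mode(desired)
    return True
```

Running `reconcile` periodically would also cover the unplanned-reset case from point 2: if the container restarts and falls back to a different mode, the next check puts it back.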

zyclonite / zerotier-docker

allow switching gateway modes without recreating/restarting the container #30