telepresenceio / telepresence

Local development against a remote Kubernetes or OpenShift cluster
https://www.telepresence.io

Telepresence systemd service(s) #2958

Open spnngl opened 1 year ago

spnngl commented 1 year ago

Please describe your use case / problem.

We'd like to run telepresence connect under systemd, like any other VPN/networking tool, so that systemd can monitor and restart it and we avoid launching the root-level daemon by hand. We'd also like the possibility to send logs/errors to stdout/stderr so that they end up in systemd-journald.

Describe the solution you'd like

One systemd service/template to connect to one or more clusters.

Describe alternatives you've considered

I've made some services, one root and one user, to do it. They are basic and need improvements/suggestions.

Root daemon:

# /etc/systemd/system/telepresence.service
[Unit]
Description=Telepresence daemon
Documentation=https://www.telepresence.io/docs/latest/quick-start/
After=network-online.target multi-user.target
Wants=network-online.target

[Service]
# Ensure we're clean and no old socket is used
ExecStartPre=rm -f /var/run/telepresence-daemon.socket
StandardInput=null
ExecStart=/usr/local/bin/telepresence daemon-foreground %L/telepresence %T
ExecStop=/usr/local/bin/telepresence quit -s --no-report
LockPersonality=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
ProtectProc=invisible
ProtectClock=yes
DeviceAllow=/dev/net/tun
Environment=XDG_CACHE_HOME=%T
Environment=XDG_CONFIG_HOME=%T
ProtectControlGroups=yes
ProtectHome=yes
ProtectKernelLogs=yes
ProtectKernelModules=yes
ProtectSystem=full
Restart=always
RestartSec=0
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes

[Install]
WantedBy=multi-user.target

User daemon:

# $HOME/.config/systemd/user/telepresence-connect@.service
[Unit]
Description=Telepresence connect service

[Service]
Type=forking
ExecStart=/usr/local/bin/telepresence connect --cache-dir=%T/telepresence --context=%i --request-timeout=2m --no-report
ExecStop=/usr/local/bin/telepresence quit --stop-daemons --no-report
PrivateTmp=yes
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
ProtectProc=invisible
ProtectControlGroups=yes
ProtectHome=read-only
ProtectSystem=full
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes

Versions (please complete the following information)

Additional context

thallgren commented 1 year ago

This looks interesting, and aligns well with my own thoughts, at least when it comes to the root daemon. Starting it as a service also brings the advantage that no root permissions are needed when connecting.

For the connector, I've envisioned a service that runs as a non-root user and executes /usr/local/bin/telepresence connector-foreground, leaving the choice of the actual kubecontext to the user, but I realize that your solution here is more VPN-like.
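
A minimal sketch of such a connector service, assuming that connector-foreground needs no extra arguments (the thread does not say) and using a hypothetical unit name:

```ini
# $HOME/.config/systemd/user/telepresence-connector.service (hypothetical name)
[Unit]
Description=Telepresence connector (user daemon)

[Service]
# Runs as the logged-in user, so kubeconfig and auth plugins resolve as usual.
ExecStart=/usr/local/bin/telepresence connector-foreground
Restart=on-failure

[Install]
WantedBy=default.target
```

With a unit like this, the kubecontext would presumably still be picked by the user at connect time rather than being baked into the unit name.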

spnngl commented 1 year ago

This looks interesting, and aligns well with my own thoughts, at least when it comes to the root daemon. Starting it as a service also brings the advantage that no root permissions are needed when connecting.

It is one of the main advantages for users.

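Things I'd like and which are not yet set:

  • telepresence root daemon does not work without the XDG_* envvar
  • maybe add a systemd.socket ? multiple ones for multiple connections ?
  • logs in systemd-journald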

For the connector, I've envisioned a service that runs as a non-root user and executes /usr/local/bin/telepresence connector-foreground, leaving the choice of the actual kubecontext to the user, but I realize that your solution here is more VPN-like.

I wanted this VPN-like feeling ^^

The main disadvantages of this user service are:

  • not synchronized with the root service (systemd user services do not share the same space)

What would you like to see in those services?

PS: I also made a systemd auto-updater for telepresence:

# /etc/systemd/system/telepresence-update.service
[Unit]
Description=Download latest telepresence version from official site
Conflicts=telepresence.service
OnSuccess=telepresence.service

[Service]
Type=oneshot
RemainAfterExit=no
ExecStart=curl -fL https://app.getambassador.io/download/tel2/linux/amd64/latest/telepresence -o /usr/local/bin/telepresence
ExecStart=chmod a+x /usr/local/bin/telepresence

# /etc/systemd/system/telepresence-update.timer
[Unit]
Description=Launch telepresence-update.service unit periodically

[Timer]
OnBootSec=15min
OnUnitActiveSec=1w
Persistent=true

[Install]
WantedBy=timers.target

thallgren commented 1 year ago

  • telepresence root daemon does not work without the XDG_* envvar

It should (that's why we pass the logdir and config dir as arguments). What problem do you see when it isn't set?

  • maybe add a systemd.socket ? multiple ones for multiple connections ?

There can never be multiple connections. A connection is ultimately what controls the TUN device and the daemon will never have more than one session running.

  • logs in systemd-journald

I did look into this a while back, also had a discussion with @LukeShu (the author of the dlog package) about adding a systemd backend for dlog. He advised against that with the following motivation:

Modern daemons on systemd GNU/Linux systems shouldn't really be hitting syslog, they should just print loglevel-tagged messages to stderr. See item 10 at https://www.freedesktop.org/software/systemd/man/daemon.html#New-Style%20Daemons and https://pkg.go.dev/git.lukeshu.com/go/libsystemd/sd_daemon#Logger

I agree with that, and I think adding the ability to log to stderr when running as a proper daemon is the way to go. Perhaps add special recognition of when stderr is passed as the first argument.
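
As a rough illustration of the stderr approach: assuming Telepresence gained a way to write loglevel-tagged lines to stderr while running as a daemon (the thread suggests it does not do so yet), the unit itself would need very little. The drop-in below is a sketch with a hypothetical file name; most of its directives only make systemd's usual defaults explicit.

```ini
# /etc/systemd/system/telepresence.service.d/logging.conf (hypothetical drop-in)
[Service]
# Send both output streams to the journal.
StandardOutput=journal
StandardError=journal
# Treat a leading "<N>" on a line as its syslog priority, e.g. "<3>" for an error.
SyslogLevelPrefix=yes
# Tag entries so that `journalctl -t telepresence` finds them.
SyslogIdentifier=telepresence
```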

The main disadvantages of this user service are:

  • not synchronized with the root service (systemd user services do not share the same space)

That's why we split the daemon in two. We don't want them to share more than what's absolutely necessary.

The main advantage of a user service is that Kubernetes, with its configuration, caches, and third-party authentication mechanisms (for GKE, AWS, and Azure, to name a few; these often take a plugin-like form that spawns separate executables), runs with the user's account. It would be very bad if a user was forced to configure such things using the root account. This is the one and only motivation for the split-daemon approach. We really don't want the connector daemon to run as root.

I'm intrigued by the systemd auto-updater. Perhaps we should expose the update checker that Telepresence has built-in so that it could be used as a trigger?

spnngl commented 1 year ago

  • telepresence root daemon does not work without the XDG_* envvar

It should (that's why we pass the logdir and config dir as arguments). What problem do you see when it isn't set?

Launching without XDG_CACHE_HOME gives this error:

telepresence[526070]: neither $XDG_CACHE_HOME nor $HOME are defined

I did not look into it in depth, but I think the user counterpart needs access to this cache. The user daemon would not connect when PrivateTmp was set on the root service (I use the temp dir as XDG_CACHE_HOME).

It seems that removing XDG_CONFIG_HOME no longer gives an error.
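
One possible way around this, sketched under assumptions: the message matches the error that Go's os.UserCacheDir returns when neither variable is set, so pointing XDG_CACHE_HOME at a persistent cache root instead of the temp dir may be enough. The drop-in name below is hypothetical, and it assumes the daemon keeps its cache in a telepresence subdirectory of XDG_CACHE_HOME, which this thread does not confirm.

```ini
# /etc/systemd/system/telepresence.service.d/cache.conf (hypothetical drop-in)
[Service]
# An empty assignment resets the Environment= entries from the main unit file.
Environment=
# %C expands to the cache directory root (/var/cache for system services).
Environment=XDG_CACHE_HOME=%C
# Have systemd create /var/cache/telepresence before the daemon starts.
CacheDirectory=telepresence
```

Unlike a private temp dir, this location stays visible to the user daemon, although whether that is sufficient for it to connect is untested here.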

  • maybe add a systemd.socket ? multiple ones for multiple connections ?

There can never be multiple connections. A connection is ultimately what controls the TUN device and the daemon will never have more than one session running.

Yes, I was thinking of multiple TUN devices, one per cluster.

  • logs in systemd-journald

I did look into this a while back, also had a discussion with @LukeShu (the author of the dlog package) about adding a systemd backend for dlog. He advised against that with the following motivation:

Modern daemons on systemd GNU/Linux systems shouldn't really be hitting syslog, they should just print loglevel-tagged messages to stderr. See item 10 at https://www.freedesktop.org/software/systemd/man/daemon.html#New-Style%20Daemons and https://pkg.go.dev/git.lukeshu.com/go/libsystemd/sd_daemon#Logger

I agree with that, and I think adding the ability to log to stderr when running as a proper daemon is the way to go. Perhaps add special recognition of when stderr is passed as the first argument.

I agree too. I was thinking of an option to force logging to stdout/stderr; telepresence already does this when we launch it in a TTY.

The main disadvantages of this user service are:

  • not synchronized with the root service (systemd user services do not share the same space)

That's why we split the daemon in two. We don't want them to share more than what's absolutely necessary.

The main advantage of a user service is that Kubernetes, with its configuration, caches, and third-party authentication mechanisms (for GKE, AWS, and Azure, to name a few; these often take a plugin-like form that spawns separate executables), runs with the user's account. It would be very bad if a user was forced to configure such things using the root account. This is the one and only motivation for the split-daemon approach. We really don't want the connector daemon to run as root.

I understand, and I think you're right (I did not find any good alternatives). We have to be careful nonetheless: the lack of synchronization can lead to weird state/behaviour.

I'm intrigued by the systemd auto-updater. Perhaps we should expose the update checker that Telepresence has built-in so that it could be used as a trigger?

That would be great!

spnngl commented 1 year ago

A note I'd like to share: someone on my team will likely do something similar for launchd (macOS).

desaintmartin commented 6 months ago

An automated way to manage telepresence would be great, instead of starting/stopping it each time we change networks and/or put the machine to sleep.
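
For the sleep case, one common systemd pattern is a small oneshot unit that is pulled in by suspend.target and ordered after it, so that it runs on resume. The sketch below assumes the root daemon unit from earlier in this thread is installed as telepresence.service; the unit name telepresence-resume.service is hypothetical, and this has not been tested with Telepresence.

```ini
# /etc/systemd/system/telepresence-resume.service (hypothetical name)
[Unit]
Description=Restart the Telepresence root daemon after resume
# Ordering after suspend.target means this runs once the machine has resumed.
After=suspend.target

[Service]
Type=oneshot
# try-restart only restarts the daemon if it is already running.
ExecStart=systemctl try-restart telepresence.service

[Install]
WantedBy=suspend.target
```

This does not cover network changes or hibernation, and the per-user connect service would still need to be restarted separately.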

thallgren commented 6 months ago

For people interested in this feature: I'd recommend that you contribute a PR that adds the needed files in a new folder. Include a README.md in that folder explaining how to set things up.

Ping @desaintmartin @spnngl

github-actions[bot] commented 4 weeks ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment, or this will be closed in 7 days.