basecamp / thruster


Socket activation crashes Thruster #35

Open · airblade opened this issue 2 weeks ago

airblade commented 2 weeks ago

Hello!

Usually I deploy Puma behind Caddy, communicating over a socket and using systemd's socket activation. This allows graceful, zero-downtime restarts when deploying a new version of the code.

I'm trying to replace Caddy with Thruster. Ideally I'd like to keep the socket activation so I can have graceful restarts. (I'm not using containers or Kamal; instead I git-push to my production server where a git hook reloads/restarts everything.)

However I can't quite get it to work. My question is: is it actually possible for Thruster to accept a socket from systemd? If not, I'll stop :)

This is what I've got so far:

/etc/systemd/system/puma.service

```
[Unit]
Description=Puma HTTP Server
After=network.target
Requires=puma.socket

[Service]
Type=notify
NotifyAccess=all
WatchdogSec=10
User=deploy
Group=deploy
WorkingDirectory=/var/www/fooapp
ExecStart=/var/www/fooapp/bin/thrust /var/www/fooapp/bin/rails server
Restart=always
Environment=MALLOC_ARENA_MAX=2
Environment=RAILS_MASTER_KEY=...
Environment=RAILS_ENV=production
Environment=RACK_ENV=production
Environment=WEB_CONCURRENCY=2
Environment=RAILS_MAX_THREADS=3
Environment=PUMA_MAX_THREADS=3
Environment=TLS_DOMAIN=fooapp.com
StandardOutput=append:/var/www/fooapp/log/rails-out.log
StandardError=append:/var/www/fooapp/log/rails-err.log
# This will default to "bash" if we don't specify it
SyslogIdentifier=puma

[Install]
WantedBy=multi-user.target
```
/etc/systemd/system/puma.socket

```
[Unit]
Description=Puma HTTP Server Accept Sockets

[Socket]
ListenStream=443
# Socket options matching Puma defaults
NoDelay=true
ReusePort=true
Backlog=1024

[Install]
WantedBy=sockets.target
```

The puma.socket seems to start up fine but puma.service doesn't.

The stdout log looks normal but there's a big stacktrace in stderr:

SIGABRT: abort
PC=0x403bec m=0 sigcode=0

goroutine 1 gp=0xc0000061c0 m=0 mp=0xaf3b80 [syscall]:
syscall.Syscall6(0xf7, 0x1, 0x144bf, 0xc0000f1c08, 0x1000004, 0x0, 0x0)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/syscall/syscall_linux.go:91 +0x39 fp=0xc0000f1bd0 sp=0xc0000f1b70 pc=0x4ad399
os.(*Process).blockUntilWaitable(0xc000028240)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/os/wait_waitid.go:32 +0x76 fp=0xc0000f1ca8 sp=0xc0000f1bd0 pc=0x4cc1f6
os.(*Process).wait(0xc000028240)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/os/exec_unix.go:22 +0x25 fp=0xc0000f1d08 sp=0xc0000f1ca8 pc=0x4c7f45
os.(*Process).Wait(...)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/os/exec.go:134
os/exec.(*Cmd).Wait(0xc0000e6000)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/os/exec/exec.go:897 +0x45 fp=0xc0000f1d68 sp=0xc0000f1d08 pc=0x7446c5
github.com/basecamp/thruster/internal.(*UpstreamProcess).Run(0xc000032f30)
    /Users/kevin/Work/basecamp/thruster/internal/upstream_process.go:37 +0x11a fp=0xc0000f1db0 sp=0xc0000f1d68 pc=0x74c55a
github.com/basecamp/thruster/internal.(*Service).Run(0xc0000f1f28)
    /Users/kevin/Work/basecamp/thruster/internal/service.go:38 +0x325 fp=0xc0000f1ee8 sp=0xc0000f1db0 pc=0x74c265
main.main()
    /Users/kevin/Work/basecamp/thruster/cmd/thrust/main.go:25 +0xa5 fp=0xc0000f1f50 sp=0xc0000f1ee8 pc=0x74d4e5
runtime.main()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:271 +0x29d fp=0xc0000f1fe0 sp=0xc0000f1f50 pc=0x43ac5d
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000f1fe8 sp=0xc0000f1fe0 pc=0x46d5c1

goroutine 2 gp=0xc000006700 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:402 +0xce fp=0xc00004efa8 sp=0xc00004ef88 pc=0x43b08e
runtime.goparkunlock(...)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:408
runtime.forcegchelper()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:326 +0xb3 fp=0xc00004efe0 sp=0xc00004efa8 pc=0x43af13
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00004efe8 sp=0xc00004efe0 pc=0x46d5c1
created by runtime.init.6 in goroutine 1
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:314 +0x1a

goroutine 3 gp=0xc000006c40 m=nil [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:402 +0xce fp=0xc00004f780 sp=0xc00004f760 pc=0x43b08e
runtime.goparkunlock(...)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:408
runtime.bgsweep(0xc000026070)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mgcsweep.go:278 +0x94 fp=0xc00004f7c8 sp=0xc00004f780 pc=0x426834
runtime.gcenable.gowrap1()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mgc.go:203 +0x25 fp=0xc00004f7e0 sp=0xc00004f7c8 pc=0x41b185
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00004f7e8 sp=0xc00004f7e0 pc=0x46d5c1
created by runtime.gcenable in goroutine 1
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mgc.go:203 +0x66

goroutine 4 gp=0xc000006e00 m=nil [GC scavenge wait]:
runtime.gopark(0xc000026070?, 0x88c410?, 0x1?, 0x0?, 0xc000006e00?)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:402 +0xce fp=0xc00004ff78 sp=0xc00004ff58 pc=0x43b08e
runtime.goparkunlock(...)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:408
runtime.(*scavengerState).park(0xaf29a0)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00004ffa8 sp=0xc00004ff78 pc=0x424229
runtime.bgscavenge(0xc000026070)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc00004ffc8 sp=0xc00004ffa8 pc=0x4247bc
runtime.gcenable.gowrap2()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mgc.go:204 +0x25 fp=0xc00004ffe0 sp=0xc00004ffc8 pc=0x41b125
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00004ffe8 sp=0xc00004ffe0 pc=0x46d5c1
created by runtime.gcenable in goroutine 1
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mgc.go:204 +0xa5

goroutine 5 gp=0xc000007340 m=nil [finalizer wait]:
runtime.gopark(0xc00004e660?, 0x4236fc?, 0x60?, 0x51?, 0x550011?)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:402 +0xce fp=0xc00004e620 sp=0xc00004e600 pc=0x43b08e
runtime.runfinq()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mfinal.go:194 +0x107 fp=0xc00004e7e0 sp=0xc00004e620 pc=0x41a1c7
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00004e7e8 sp=0xc00004e7e0 pc=0x46d5c1
created by runtime.createfing in goroutine 1
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/mfinal.go:164 +0x3d

goroutine 18 gp=0xc000007500 m=3 mp=0xc000055008 [syscall]:
runtime.notetsleepg(0xb54600, 0xffffffffffffffff)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/lock_futex.go:246 +0x29 fp=0xc0000507a0 sp=0xc000050778 pc=0x40ce89
os/signal.signal_recv()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/sigqueue.go:152 +0x29 fp=0xc0000507c0 sp=0xc0000507a0 pc=0x46a229
os/signal.loop()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/os/signal/signal_unix.go:23 +0x13 fp=0xc0000507e0 sp=0xc0000507c0 pc=0x746533
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000507e8 sp=0xc0000507e0 pc=0x46d5c1
created by os/signal.Notify.func1.1 in goroutine 8
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/os/signal/signal.go:151 +0x1f

goroutine 17 gp=0xc0000076c0 m=nil [select, locked to thread]:
runtime.gopark(0xc000050fa8?, 0x2?, 0x29?, 0xb3?, 0xc000050f94?)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:402 +0xce fp=0xc000050e38 sp=0xc000050e18 pc=0x43b08e
runtime.selectgo(0xc000050fa8, 0xc000050f90, 0x0?, 0x0, 0x0?, 0x1)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/select.go:327 +0x725 fp=0xc000050f58 sp=0xc000050e38 pc=0x44c445
runtime.ensureSigM.func1()
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/signal_unix.go:1034 +0x19f fp=0xc000050fe0 sp=0xc000050f58 pc=0x46501f
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000050fe8 sp=0xc000050fe0 pc=0x46d5c1
created by runtime.ensureSigM in goroutine 8
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/signal_unix.go:1017 +0xc8

goroutine 8 gp=0xc000007880 m=nil [chan receive]:
runtime.gopark(0x746145?, 0x79b6c0?, 0x1?, 0x1?, 0xc0000516f8?)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/proc.go:402 +0xce fp=0xc000051668 sp=0xc000051648 pc=0x43b08e
runtime.chanrecv(0xc00010e060, 0xc000051788, 0x1)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/chan.go:583 +0x3bf fp=0xc0000516e0 sp=0xc000051668 pc=0x40719f
runtime.chanrecv1(0xc00010e060?, 0xc000051798?)
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/chan.go:442 +0x12 fp=0xc000051708 sp=0xc0000516e0 pc=0x406dd2
github.com/basecamp/thruster/internal.(*UpstreamProcess).handleSignals(0xc000032f30)
    /Users/kevin/Work/basecamp/thruster/internal/upstream_process.go:55 +0xaf fp=0xc0000517c8 sp=0xc000051708 pc=0x74c72f
github.com/basecamp/thruster/internal.(*UpstreamProcess).Run.gowrap1()
    /Users/kevin/Work/basecamp/thruster/internal/upstream_process.go:36 +0x25 fp=0xc0000517e0 sp=0xc0000517c8 pc=0x74c645
runtime.goexit({})
    /opt/homebrew/Cellar/go/1.22.2/libexec/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000517e8 sp=0xc0000517e0 pc=0x46d5c1
created by github.com/basecamp/thruster/internal.(*UpstreamProcess).Run in goroutine 1
    /Users/kevin/Work/basecamp/thruster/internal/upstream_process.go:36 +0x10c

rax    0xf7
rbx    0x1
rcx    0x403bee
rdx    0xc0000f1c08
rdi    0x1
rsi    0x144bf
rbp    0xc0000f1b60
rsp    0xc0000f1b20
r8     0x0
r9     0x0
r10    0x1000004
r11    0x216
r12    0xc0000f1c50
r13    0x0
r14    0xc0000061c0
r15    0x1ffffffffffff
rip    0x403bec
rflags 0x216
cs     0x33
fs     0x0
gs     0x0

This is all on Ubuntu 24.04 LTS with Thruster 0.1.4, Puma 6.4.2 and Ruby 3.3.3.

airblade commented 2 weeks ago

The last line of the stacktrace points here:

https://github.com/basecamp/thruster/blob/ec8c5f6b9b06425dbffc4edb43361e48f0afe8da/internal/upstream_process.go#L36

Hmm, maybe this is caused by Type=notify instead of Type=simple? I don't know. SIGABRT is also the signal systemd's watchdog sends by default when it gets no keep-alive ping, and my unit sets WatchdogSec=10, so that could be what's killing the process.
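
For reference, my understanding of what Type=notify expects: systemd passes a unix datagram socket in $NOTIFY_SOCKET and waits for a READY=1 message (plus periodic WATCHDOG=1 pings when WatchdogSec= is set). An untested sketch of that handshake in Go; the function name is mine, not something in Thruster:

```go
// Untested sketch of the sd_notify handshake a Type=notify service
// performs. systemd sets $NOTIFY_SOCKET to a unix datagram socket;
// sending "READY=1" marks the service as started, and "WATCHDOG=1"
// would be the periodic keep-alive when WatchdogSec= is configured.
package main

import (
	"net"
	"os"
)

func notifyReady() error {
	socket := os.Getenv("NOTIFY_SOCKET")
	if socket == "" {
		return nil // not started by systemd with Type=notify
	}
	conn, err := net.DialUnix("unixgram", nil,
		&net.UnixAddr{Name: socket, Net: "unixgram"})
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write([]byte("READY=1"))
	return err
}

func main() {
	if err := notifyReady(); err != nil {
		os.Exit(1)
	}
}
```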

airblade commented 2 weeks ago

> I'd like to keep the socket activation so I can have graceful restarts.

The other benefit of socket activation is binding to a privileged port (which Caddy was taking care of previously).

If Thruster isn't compatible with socket activation, how can I run it with systemd without socket activation?

I tried running it without socket activation (by commenting out Requires=puma.socket in my service file and removing the puma.socket file). It starts successfully, but Thruster doesn't bind to 443.

airblade commented 2 weeks ago

Aha! Adding this to puma.service allows Thruster to bind to 443:

AmbientCapabilities=CAP_NET_BIND_SERVICE

OK, so I can run Thruster via systemd without socket activation. Yay!

Will Thruster accept an activated socket from systemd? No worries if not. (I don't know Go, but as far as I can tell from the code, it won't.)
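
For anyone else digging: the receiving side of socket activation is fairly small. systemd sets LISTEN_PID and LISTEN_FDS in the service's environment and passes the sockets starting at file descriptor 3. A rough, untested sketch of what picking one up would look like in Go (function name is mine):

```go
// Hypothetical sketch: how a Go server could pick up a systemd-activated
// socket. systemd sets LISTEN_PID to the service's PID and LISTEN_FDS to
// the number of sockets passed, starting at file descriptor 3.
package main

import (
	"net"
	"os"
	"strconv"
)

const listenFdsStart = 3

// activatedListener returns the first systemd-activated socket as a
// net.Listener, or nil if this process wasn't socket-activated.
func activatedListener() (net.Listener, error) {
	pid, err := strconv.Atoi(os.Getenv("LISTEN_PID"))
	if err != nil || pid != os.Getpid() {
		return nil, nil // fds (if any) were not addressed to this process
	}
	if n, err := strconv.Atoi(os.Getenv("LISTEN_FDS")); err != nil || n < 1 {
		return nil, nil
	}
	f := os.NewFile(listenFdsStart, "systemd-socket")
	defer f.Close() // net.FileListener dups the descriptor
	return net.FileListener(f)
}

func main() {
	ln, err := activatedListener()
	if err != nil {
		os.Exit(1)
	}
	if ln == nil {
		// Not socket-activated: bind normally instead.
		ln, err = net.Listen("tcp", ":3000")
		if err != nil {
			os.Exit(1)
		}
	}
	ln.Close() // a real server would hand ln to http.Serve here
}
```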

airblade commented 2 weeks ago

Yesterday I was trying to use socket activation with port 443, which I couldn't get to work.

Today I thought it might make more sense to give Puma the activated socket instead, so I've been trying to activate port 3000. This required getting Thruster to call bin/puma instead of bin/rails server so I could pass it --bind-to-activated-sockets.

puma.socket

```
[Socket]
ListenStream=0.0.0.0:3000

[Install]
WantedBy=sockets.target
```
puma.service

```
...
ExecStart=/var/www/fooapp/bin/thrust /var/www/fooapp/bin/puma -p 3000
# I also tried:
#ExecStart=/var/www/fooapp/bin/thrust /var/www/fooapp/bin/puma --bind-to-activated-sockets
...
```

However I always get Address already in use - bind(2) for "0.0.0.0" port 3000. (Which would follow if the activated descriptor never reaches Puma: systemd is already bound to port 3000, so Puma's own attempt to bind it fails.)

Puma's socket activation docs say:

> Any wrapper scripts which exec, or other indirections in ExecStart may result in activated socket file descriptors being closed before reaching the puma master process.

Does this apply to Thruster? I think it does an exec but my Go's not good enough to be sure.
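
From skimming the os/exec docs: a child started via exec.Cmd inherits only stdin, stdout and stderr; every other open descriptor is closed unless it's listed in cmd.ExtraFiles. So if Thruster starts Puma that way, an activated socket on fd 3 would indeed be closed before Puma sees it. An untested sketch of what explicit forwarding would involve (not what Thruster currently does, as far as I can tell):

```go
// Hypothetical sketch of forwarding a systemd-activated socket to a child.
// By default os/exec gives the child only stdin/stdout/stderr; any other
// open descriptor (including an activated socket on fd 3) is closed.
// ExtraFiles re-exposes files to the child starting at descriptor 3.
package main

import (
	"os"
	"os/exec"
)

func runUpstream() error {
	cmd := exec.Command("bin/puma", "--bind-to-activated-sockets")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	// ExtraFiles[0] becomes fd 3 in the child, which is where the
	// socket-activation protocol expects the first activated socket.
	cmd.ExtraFiles = []*os.File{os.NewFile(3, "systemd-socket")}
	cmd.Env = append(os.Environ(), "LISTEN_FDS=1")

	// Caveat: Puma also checks that LISTEN_PID matches its own PID, and
	// the child's PID isn't known until after Start. That mismatch is
	// presumably the "wrapper scripts" problem the Puma docs warn about.
	if err := cmd.Start(); err != nil {
		return err
	}
	return cmd.Wait()
}

func main() {
	if err := runUpstream(); err != nil {
		os.Exit(1)
	}
}
```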