nix-community / lorri

Your project’s nix-env [maintainer=@Profpatsch,@nyarly]
Apache License 2.0
638 stars, 24 forks

lorri daemon leaks file descriptors with stream-events --kind snapshots #35

Status: Closed (symphorien closed this 3 years ago)

symphorien commented 3 years ago

Describe the bug

Running lorri internal stream-events --kind snapshot leaks 2 fds in the daemon on each invocation.

To Reproduce

Steps to reproduce the behavior:

  1. run lorri daemon
  2. in another shell, run while sleep 1; do lorri internal stream-events --kind snapshot; done
  3. in another shell run watch 'lsof -p 1234 | wc -l' where 1234 is the pid of lorri daemon
  4. watch as this number increases by 2 every second
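
For anyone who would rather count descriptors directly than go through lsof (which also lists cwd, txt and mem entries, so its absolute numbers are higher), here is a minimal Rust sketch that counts the entries in /proc/<pid>/fd. It is Linux-only, and the name count_open_fds is made up for this illustration; it is not part of lorri.

```rust
use std::fs;
use std::io;

/// Count the open file descriptors of process `pid` by listing
/// /proc/<pid>/fd. Linux only. `count_open_fds` is a hypothetical helper,
/// not a lorri function.
fn count_open_fds(pid: u32) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(format!("/proc/{}/fd", pid))? {
        entry?; // propagate errors from individual directory entries
        count += 1;
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // Assumed usage: pass the daemon's pid as the first argument;
    // without one, fall back to this process's own pid.
    let pid = std::env::args()
        .nth(1)
        .and_then(|arg| arg.parse::<u32>().ok())
        .unwrap_or_else(std::process::id);
    println!("{} open fds in process {}", count_open_fds(pid)?, pid);
    Ok(())
}
```

Run it once per second against the daemon's pid while the loop from step 2 is active. If the leak is present, the count should grow by the same amount that lsof reports.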

Expected behavior

The number of file descriptors stays more or less constant.

Metadata

$  lorri info --shell-file shell.nix
lorri version: 1.4
GC roots exist, shell_gc_root: /home/symphorien/.cache/lorri/gc_roots/07d6e569120731e7ee9a3297613c71b7/gc_root/shell_gc_root
$  uname -a
Linux bete 5.10.25 #1-NixOS SMP Sat Mar 20 09:43:44 UTC 2021 x86_64 GNU/Linux

revision: current canon: 4fb9199c5d205e18a62442b857cc2f19207fbb8e

Profpatsch commented 3 years ago

I can confirm, running

for i in $(seq 1 100); do
  lorri internal stream-events --kind snapshot
done

and then checking the lorri daemon process with

lsof -p <lorri daemon pid>

shows that 100 fds to the socket are open, so incoming fds are not dropped. I think the handler threads might also still be running; checking that now.

Profpatsch commented 3 years ago

Update: the same does not happen with lorri ping.

If you run lorri daemon -v it shows the reason:

Apr 03 10:24:57.030 DEBG client vanished, error: Serialize(Io(Os { code: 32, kind: BrokenPipe, message: "Broken pipe" })), communication_type: StreamEvents
Apr 03 10:24:57.030 DEBG Sent new listener sectionend, keep: true

The handler panics (because the code was badly written), and thus the drop function of the file handle is not run, leaving the file open.

Profpatsch commented 3 years ago

It wasn’t actually a panic, just a mishandled error condition which didn’t stop the worker thread:

https://github.com/nix-community/lorri/blob/4fb9199c5d205e18a62442b857cc2f19207fbb8e/src/daemon/server.rs#L89-L94

You can see that the handler continues in the loop even though the client has vanished.
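
To illustrate the general shape of a fix, here is a minimal sketch of a handler loop that stops on the first failed write. It is not the actual lorri code: stream_events, Event, and the channel-based setup are assumptions made for this example. The point is only that returning from the loop on an error such as BrokenPipe ends the handler, which drops the stream and closes its fd.

```rust
use std::io::{self, Write};
use std::os::unix::net::UnixStream;
use std::sync::mpsc::{self, Receiver};

/// Hypothetical stand-in for the daemon's event type.
type Event = String;

/// Forward events to the client until the channel closes or a write fails.
/// Returns the number of events delivered. `client` is owned by this
/// function, so it is dropped (and its fd closed) on every return path.
fn stream_events(mut client: UnixStream, events: Receiver<Event>) -> io::Result<usize> {
    let mut sent = 0;
    for event in events {
        // `?` leaves the loop on the first write error (e.g. BrokenPipe)
        // instead of logging it and continuing with the next event.
        writeln!(client, "{}", event)?;
        sent += 1;
    }
    Ok(sent)
}

fn main() {
    let (server_end, client_end) = UnixStream::pair().expect("socketpair failed");
    let (tx, rx) = mpsc::channel();

    // Simulate a client that vanished before any event was sent.
    drop(client_end);
    tx.send("snapshot".to_string()).unwrap();
    tx.send("snapshot".to_string()).unwrap();
    drop(tx);

    // The handler returns after the first failed write.
    let result = stream_events(server_end, rx);
    println!("{:?}", result.map_err(|e| e.kind()));
}
```

Because the function owns the stream, there is no separate cleanup step to forget. The worker thread only has to return once the client is gone.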