chimera-linux / turnstile

Independent session/login tracker
BSD 2-Clause "Simplified" License

Unable to login if error in .profile #10

Closed: wezm closed this issue 1 year ago

wezm commented 1 year ago

I wasn't really sure where to open this, since there are several interrelated parts and it's arguably user error, but I figured it was worth documenting and opening for discussion.

Background

Shortly after installing Chimera in July, I checked whether rustup worked (it does, but obviously none of the toolchains it installs do). I was still running the default shell, and when rustup installed itself it added the following line to ~/.profile:

. "$HOME/.cargo/env"

(rustup creates the $HOME/.cargo/env file when it is installed.)

The issue

This morning I deleted ~/.cargo, and after that I could no longer log in (gdm just hung). The turnstile entries in syslog showed:

2023-08-25 00:20:22 debug turnstiled: conn: accepted 11 for 8
2023-08-25 00:20:22 debug turnstiled: turnstiled: poll
2023-08-25 00:20:22 debug turnstiled: msg: read 5 (11)
2023-08-25 00:20:22 debug turnstiled: msg: welcome 1000
2023-08-25 00:20:22 debug turnstiled: msg: repopulate login 1000
2023-08-25 00:20:22 debug turnstiled: msg: new session for 1000/11
2023-08-25 00:20:22 debug turnstiled: msg: start service manager
2023-08-25 00:20:22 debug turnstiled: srv: setup rundir for 1000
2023-08-25 00:20:22 debug turnstiled: rundir: make directory /run/user/1000
2023-08-25 00:20:22 debug turnstiled: rundir: try make parent /run
2023-08-25 00:20:22 debug turnstiled: rundir: try make parent /run/user
2023-08-25 00:20:22 debug turnstiled: srv: create login dir for 1000
2023-08-25 00:20:22 debug turnstiled: srv: create readiness pipe
2023-08-25 00:20:22 debug turnstiled: srv: timer set
2023-08-25 00:20:22 debug turnstiled: srv: launch
2023-08-25 00:20:22 debug turnstiled: msg: wait
2023-08-25 00:20:22 debug turnstiled: turnstiled: poll
2023-08-25 00:20:22 debug turnstiled: turnstiled: poll
2023-08-25 00:20:22 debug turnstiled: turnstiled: sigchld
2023-08-25 00:20:22 debug turnstiled: srv: reap 4483
2023-08-25 00:20:22 err turnstiled: srv: died without notifying readiness
2023-08-25 00:20:22 debug turnstiled: rundir: clear directory /run/user/1000
2023-08-25 00:20:22 debug turnstiled: turnstiled: drop login 1000
2023-08-25 00:20:22 debug turnstiled: conn: close 11 for login 1000
2023-08-25 00:20:22 debug turnstiled: srv: stop
2023-08-25 00:20:22 debug turnstiled: dir_clear: clear srv.4484 at 12
2023-08-25 00:20:22 debug turnstiled: turnstiled: poll

turnstiled.log:

[  OK  ] system
Service 'boot' started.
[  OK  ] boot
.: cannot open /home/wmoore/.cargo/env: No such file or directory
[  OK  ] dbus
Service 'boot' started.
[  OK  ] system
[  OK  ] boot
[STOPPD] boot
[STOPPD] system
[STOPPD] boot
[STOPPD] system
[STOPPD] dbus
[STOPPD] dbus

It took me a while to work out where that .: cannot open /home/wmoore/.cargo/env: No such file or directory error was coming from. I think it is:

https://github.com/chimera-linux/chimerautils/blob/bfe845fe863f9aa5f6b550df3d2c9ff92211495b/src.freebsd/sh/input.c#L370

So I think what is happening is that /bin/sh exits when it tries to source the missing file. That in turn either makes the dinit backend script exit, or makes something that dinit spawns fail, since there appears to be some dinit output in the turnstile log.

Anyway, as I said, perhaps this can be chalked up to user error, so feel free to close. However, it was quite difficult to map the cause to the effect. It's perhaps worth noting that bash does not appear to exit if it fails to source a file (it prints an error but keeps going), so if I had hit this scenario with bash providing /bin/sh I might still have been able to log in. I don't mention this to suggest bash should be used or anything, just to highlight a difference from the BSD sh.

Resolution

I was able to log in again after creating an empty file at ~/.cargo/env. Once I had worked out what was going on, I was also able to remove the offending line from ~/.profile.
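For anyone hitting the same thing, guarding the line avoids this failure mode. This is only a minimal sketch, not what rustup writes by default: `source_if_present` is a hypothetical helper name, and it assumes a POSIX-compatible /bin/sh.

```shell
# Hypothetical helper for ~/.profile: source a file only if it exists and is
# readable, so a deleted file cannot abort a non-interactive shell.
source_if_present() {
    if [ -r "$1" ]; then
        . "$1"
    fi
}

# Replaces the bare `. "$HOME/.cargo/env"` line; does nothing if the file is gone.
source_if_present "$HOME/.cargo/env"
```

The `if` form is used rather than `[ -r ... ] && . ...` so that the helper returns success when the file is missing, which keeps it safe to use in scripts that check exit statuses.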

q66 commented 1 year ago

turnstile will not prevent a login when there is an error in the service (the login will still proceed after the failure); if you can't log in on tty, it's more likely something like your login shell failing to run

also fwiw, freebsd sh also does not exit when sourcing a file fails; it just prints a message and sets $?

q66 commented 1 year ago

that said, it's weird that turnstile comes down at all (i feel like that message is not the source of it though, the only login-shell-like thing turnstile spawns is the backend script itself and since dinit executes, it's clearly not the problem), so i'll check the actual behavior on my side before closing this

NovaAndrom3da commented 1 year ago

Can confirm, issue is reproducible, and I've fallen victim to it as well.

q66 commented 1 year ago

This turned out to be a combination of two issues:

1) The PAM module gets stuck in a loop when the peer closes the connection, as it did not handle the case of receiving zero bytes (which means an EOF when using a blocking connection like that) so it'd keep trying to receive more bytes forever (4cd08b1d0795d41e01c3b6cbc1f96160e0dd481a)

2) It turns out some shells, including ash and anything derived from it, die (with code 127) when sourcing a non-existent file in a non-interactive script, with no way around it, or at least no way to make it work equivalently to an interactive shell; after some thinking I decided to remove support for that (i.e. no shell profile sourcing; 6efe758a12406e2b35e783b12d89d827b3dbc44b)
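The second point can be checked from a terminal. The snippet below is a rough sketch and not part of turnstile: `survives_missing_source` is a hypothetical helper, the path is a placeholder assumed not to exist, and the list of shells is just a guess at what might be installed. Results depend on the shell and the mode it runs in, so no expected output is shown.

```shell
# Placeholder path that is assumed not to exist on the system.
missing="/nonexistent/turnstile-issue-demo"

# Hypothetical helper: succeeds if the given shell keeps running after a
# failed `.` in a non-interactive `-c` script, fails if the shell dies.
survives_missing_source() {
    out=$("$1" -c '. "$1"; echo continued' demo "$missing" 2>/dev/null) || true
    [ "$out" = "continued" ]
}

# Try whichever of these shells are installed and report what each one does.
for shell in sh dash bash; do
    command -v "$shell" >/dev/null 2>&1 || continue
    if survives_missing_source "$shell"; then
        echo "$shell: continued after the failed source"
    else
        echo "$shell: exited at the failed source"
    fi
done
```

The path is passed as a positional argument rather than pasted into the script text, so quoting stays simple. The error message each shell prints is discarded; drop the `2>/dev/null` to see it.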

i will close this now and roll a new release after a bit of testing

wezm commented 1 year ago

Thanks for looking into it. 👍