Closed treeshateorcs closed 3 years ago
I can't reproduce that currently, and also don't have an idea why that might happen. The thing ist, pam-gnupg sends the passphrase from pam_setcred
or pam_open_session
, both of which are only called after successful authentication. So failed attempts should in princple never reach the agent.
I don't know anything about systemd-homed, but your log looks as if you're using an encrypted home partition. Can you maybe outline your setup, and maybe also post the relevant files from /etc/pam.d
, probably login
and everything inlucded from there?
Also, the line
Nov 15 03:55:13 unix gpg-agent[16766]: Failed to lookup password for key n/8C07AF9F4560150577D103DE2B50D1389B3891AF with secret service: The name org.freedesktop.secrets was not provided by any .service files
is a bit unusual. Far as I know gpg-agent does not use org.freedesktop.secret. Do you have an idea where that might come from? Do you use gnome-keyring or kwallet?
Finally, in case you haven't already done that, can you try adding the store-only
option to auth
, i.e.
auth optional pam_gnupg.so store-only
That disables sending the passphrase from auth
at all, deferring that to the session
module. Maybe the issue is related to sending the passphrase too early (wrt. your encrypted home for example).
i use systemd-homed, and it encrypts the home folder using luks by default. i'm on archlinux, (i think it's the only major distro, as of now, that supports systemd-homed)
here's my system-local-login
#%PAM-1.0
auth include system-login
account include system-login
password include system-login
session include system-login
auth optional pam_gnupg.so store-only
session optional pam_gnupg.so
and swaylock
(just in case)
#%PAM-1.0
#
# PAM configuration file for the swaylock screen locker. By default, it includes
# the 'login' configuration file (see /etc/pam.d/login)
#
auth include login
auth optional pam_gnupg.so
and login
#%PAM-1.0
auth required pam_securetty.so
auth requisite pam_nologin.so
auth include system-local-login
account include system-local-login
session include system-local-login
these files are the only ones i changed (according to your README)
this line is at the end (before the line that is responsible for running sway on tty1) of my .zprofile
export SSH_AUTH_SOCK=$(gpgconf --list-dirs agent-ssh-socket)
(but it doesn't work, though that's a different issue)
and my keygrip in ~/.pam-gnupg
that's pretty much all of my setup (unless i forget something)
i used gnome-keyring previously, and i set some things up so that it could unlock the gpg keyring on login, but i got rid of everything of that (i think, unless i forget something)
and no store-only
didn't have any effect, it still asks for my password after at least one failed login attempt
I pushed a commit that adds some basic debug logging, activated by a new module option debug
. Can you set that for all instances of pam_gnupg.so
, try again and post the relevant journal output of both login attempts?
i added debug to the pam_gnupg lines, so that system-local-login
looks like this
#%PAM-1.0
auth include system-login
account include system-login
password include system-login
session include system-login
auth optional pam_gnupg.so debug
session optional pam_gnupg.so debug
and swaylock like this (even though i don't have problems with it)
auth include login
auth optional pam_gnupg.so debug
and here's the log
i rebooted, and there's a longer log, with more occurences of pam_gnupg
. maybe it will be more helpful
The logs look ok. The only log entries from the first attempt seem to be (from your second logfile)
Nov 18 01:56:11 unix systemd-homed[407]: tho: changing state inactive → activating-for-acquire
Nov 18 01:56:11 unix systemd-homework[1311]: None of the supplied plaintext passwords unlocks the user record's hashed passwords.
Nov 18 01:56:14 unix systemd-homed[407]: Activation failed: Required key not available
Nov 18 01:56:14 unix systemd-homed[407]: tho: changing state activating-for-acquire → inactive
Nov 18 01:56:14 unix systemd-homed[407]: Got notification that all sessions of user tho ended, deactivating automatically.
Nov 18 01:56:14 unix systemd-homed[407]: Home tho already deactivated, no automatic deactivation needed.
Apparently, pam-gnupg is not even executed in this case, so I don't understand how this failed attempt can influence the outcome of the second login.
On the second login, there's an error message from setcred,
Nov 18 01:56:34 unix login[664]: pam_gnupg(login:setcred): presetting failed, retaining passphrase
but that is fine since the agent is not yet running after reboot (it's not autostarted during setcred). The key is preset later when opening the session,
Nov 18 01:56:34 unix login[664]: pam_gnupg(login:session): presetting passphrase succeeded, cleaning up
followed by another setcred that fails since the passphrase has already been cleaned up,
Nov 18 01:56:34 unix login[664]: pam_gnupg(login:setcred): unable to obtain stored passphrase
so again this is fine. You can also see the agent being socket-activated by systemd right before pam-gnupg logs the success message.
Nov 18 01:56:34 unix gpg-agent[1356]: gpg-agent (GnuPG) 2.2.23 starting in supervised mode.
What's slightly strange is that the line
Nov 18 01:56:34 unix systemd[1341]: Listening on GnuPG cryptographic agent and passphrase cache.
saying that systemd listens on the agent socket actually comes after the agent has already started and pam-gnupg is done, but maybe that is just due to log messages arriving out of order.
More importantly, there are messages from gpg-agent[1356]
and gpg-agent[1392]
, the latter being the one that tells your spotifyd that it can't get the password. So at least briefly you seem to have multiple agents running. Do you maybe set GNUPGHOME
to something other than ~/.gnupg
, and if so, where do you set it?
To exclude that there are any kind of races between pam-gnupg and systemd autostarting the agent, can you maybe try adding no-autostart
to the session optional pam_gnupg.so
line? (This will only work if you did not set GNUPGHOME
, otherwise systemd will not listen on the correct socket.)
I don't expect that to be the reason, though, because that issue would not just occur after a failed login, but I also don't have any better idea currently.
Finally, there's and error message saying
Nov 18 01:56:35 unix gpg-agent[1356]: failed to unprotect the secret key: Required environment variable not set
I don't think this is really related to this issue, but you may need to add
gpg-connect-agent --no-autostart updatestartuptty /bye
to you xinitrc
for the pinentry to work.
thank you for the elaborate answer. i appended no-autostart
to one line in system-local-login
and also added
exec gpg-connect-agent --no-autostart updatestartuptty /bye
to my sway config
here's a log https://0x0.st/i505.txt
and no, i don't have GNUPGHOME
set. my gnupg home is the standard location (~/.gnupg
)
also, i'm not sure this is related. but i actually have a slight problem with my login process. it is too slow. i have to wait for about 10 seconds before i'm logged in, i tried to debug it, and what i found out was the high cpu usage of the systemd-homework process during a login. this is hardly helpful, but i felt like telling you
Well, the environment variable not set
error did not go away by updatestartuptty
, but like I said, it's probably unrelated.
I just noticed that your log, and actually also the previous one, shows 3 different agent instances: the one started by socket activation (1291), one started presumably by spotifyd (1329), and one for which I can't tell the cause from the log (1589). Do you have an idea where that might come from? Do you use pass
somewhere else during session startup?
Maybe spotifyd, or more precisely the pass
command that it spawns, can't talk to the agent and then autostarts one of its own that is then killed with the service's entire cgroup after spotifyd fails. In fact, in the first log you posted, the one where you did not reboot, only shows one additional agent (unclear whether it's the systemd service or the “third one”), and spotifyd seems to be fine there, since in that case the agent was already running. I think you may need to add a dependency After=gpg-agent.socket
to your spotifyd.service
.
On the other hand, that would mean that also this thing is not necessarily related to the issue. Moreover, if spotifyd actually did work in that case, that means that at least at some point some agent was able to provide the password. Or maybe spotifyd itself was already running and simply did not fail due to that. (There are just too many scenarios :-/ )
Could you try the following?
fuser $(gpgconf --list-dirs agent-socket)
and compare the PIDs to the log file to figure out which instance is the one pass
is talking to, and maybe also check if there are multiple agents running.fuser
and agent instances after a reboot when you log in on the first attempt. Maybe the difference can tell us what's wrong.gpg-agent
binary away and replace it by a script like
#!/bin/sh
date >> ~/agent.log
echo "PID: $$" >> ~/agent.log
echo "Parent: $PPID" >> ~/agent.log
ls -l /proc/$PPID/exe >> ~/agent.log
pstree -pT >> ~/agent.log
exec /path/to/real/gpg-agent "$@"
Maybe that can tell us how the instances are started.
i will do the rest in a minute
this is before reboot. (login on first try)
okay this after reboot (wrong password on first try
and agent.log
https://0x0.st/i55x.txt
I still don't understand this. There's only one agent instance, it's listening on the socket, and it's pretty clearly the one that was started by systemd after being triggered by pam-gnupg (you can see the pam_gnupg_helper in the pstree output).
Maybe let's check what messages the agent is actually receiving. Can you edit the gpg-agent unit (systemctl --user edit gpg-agent.service
) with the following override:
[Service]
ExecStart=
ExecStart=/usr/bin/gpg-agent --supervised --debug-level basic
StandardOutput=append:%h/agent.journal
This will write all agent traffic to ~/agent.journal
.
This file will contain your hex-encoded password and partial plaintexts of encrypted secrets, so do not post it without editing!
I did that on my notebook and edited out all potentially secret parts, ([..keygrip..]
, [..passphrase..]
, etc): https://0x0.st/i57P.journal
All I did was restart the service, log in (once) on another VT, and then decrypt something by invoking pass
(that's the PKDECRYPT
block at the bottom, where I edited out ciphertext and plaintext). The interesting part is the preset_passphrase
line, that's coming from pam-gnupg. The last argument in that line is your hex-encoded password. You can decode that with
echo -e "$(sed 's/../\\x\0/g')"
then paste the encoded password and hit Ctrl-D
.
Can you check that the preset command arrives, contains the correct password, try decrypting something to see what turns up in the log, and maybe post an edited version of the log if you're comfortable with that?
Finally, a small remark that I never noticed until now: using pass
from systemd services like spotifyd has an inherent racing condition even when you declare After=gpg-agent.socket
. The reason is that pam_gnupg.so
runs (and must run) after pam_systemd.so
, which does not know about pam-gnupg and will therefore start spotifyd as soon as the socket is up, but potentially before the passphrase has been preset (you actually see spotifyd already running in you agent.log
). There is no simple, clean way to solve that; the only option I can think of is to introduce another service that can be placed between the socket and spotifyd, waits for some appropriate signal from pam-gnupg and only then signals successful startup to systemd. Maybe I'll implement something like that if I can come up with a sufficiently simple way to do that, but for the time being, you should probably add something like
ExecStartPre=/bin/sleep 5
to spotifyd.service
.
the decoded password is only partly correct. it's only the first 12 symbols of my password
https://0x0.st/i5hm.txt this is edited agent.journal
if you are interested
when i successfully login after the first try the decoded passphrase is correct
i just tried a "failed" login again and i got 15 symbols of my passphrase this time
sorry for the spam, i thought this was important
Well, I still don't understand, but at least we're getting somewhere. I added some debug print statements in the debug
branch. Can you build that and try again (./configure && make && sudo make install
after uninstalling the AUR package)? It should print messages directly to your console, detailing how many bytes of what tokens where received and sent by which part of the code. It will print your password in plain text. Can you tell from that at which point the password gets truncated?
okay so i think i've figured it out
do you see it?? :D
Ohhhh, the “Sorry, try again”-Prompt is not from login
, but from systemd-homed which fails with your initial password, then asks again by itself and then does not hand the passphrase down to us. We're still using the first password.
I also found it in the systemd code: https://github.com/systemd/systemd/blob/07ccf434e77b17b20d773ce8b2637083cd4fdafc/src/home/pam_systemd_home.c#L310
There's no place in the systemd codebase where the pam auth token is set, so I think it will be impossible to obtain it this way. It does set some auxiliary pam data items (systemd-home-user-record-$USER
and systemd-user-record-$USER
), but these JSON User Records only contain the hashed passphrase.
I'm afraid it may be impossible to solve this issue. I'll have to look how gnome-keyring handles this, they probably encounter the same problem, and definitely know better than me how homed works.
Anyway, at least I'm glad we figured this out. It was starting to drive me insane.
Well, gnome-keyring doesn't do any better apparently: https://gitlab.gnome.org/GNOME/gnome-keyring/-/issues/59#note_914341
Edit: systemd also has an issue that seems to be related: https://github.com/systemd/systemd/issues/16814
thank you for your work!
Systemd 247, which just reached the arch repo, contains a fix. Can you test if it works for you?
yes, i just rebooted, this problem has been fixed with the release of systemd 247
e.g. if you enter your password once, and it was wrong, then you enter again and it is correct
this may have to do with systemd-homed or that i don't use any display managers
here's an excerpt from my journalctl
https://0x0.st/i5KN.txt