Closed GregorZattler closed 6 years ago
It happened again, I had to reboot the computer. Perhaps it's helpful to note that it happened in both cases where it freezed, after the computer came out of suspend. In both cases + 1h later I wanted to unlock it and it was frozen.
Thanks for looking into this.
Are you able to unlock physlock's password prompt under normal circumstances?
Physlock uses PAM for authentication and the error generated from the PAM module pam_unix
indicates that it cannot authenticate the users named user
and root
because their passwords are null. Waking your system with rtcwake
seems to have some impact on your PAM config.
Are you able to unlock physlock's password prompt under normal circumstances?
Yes, several times a day.
Waking your system with rtcwake seems to have some impact on your PAM config.
OK, but why are there thousands of physlock attempts to get authetification per second in the first place?
My problem is the million attempts to unlock, which obviously are not the result of a human interacting with the computer. It doesn't bother me, that these millions of attempts to unlock failed. The very first and basic problem is hard disk space. ATM:
du -shc /var/log/auth.log* 12G /var/log/auth.log 12G /var/log/auth.log.1 21M /var/log/auth.log.2.gz 46M /var/log/auth.log.3.gz 128K /var/log/auth.log.4.gz 140K /var/log/auth.log.5.gz 23G total
This is on a otherwise heavily used laptop SSD of only 128 GB.
Normally, meaning if I type in passwords, physlock will slow down the appearence of the next password prompt if I enter a wrong password. But this does not apply here.
Can you please paste the contents of the files /etc/pam.d/physlock
(very likely does not exist) and /etc/pam.d/other
?
Hi Bert,
Can you please paste the contents of the files
/etc/pam.d/physlock
(very likely does not exist)
You are right, it does not exist.
and
/etc/pam.d/other
?
May be the comments at the top will be shown as headings:
$ cat /etc/pam.d/other #
#
#
@include common-auth @include common-account @include common-password @include common-session
Ciao; Gregor -- -... --- .-. . -.. ..--.. ...-.-
Can you please also post /etc/pam.d/common-auth
?
Hi Bert,
Can you please also post
/etc/pam.d/common-auth
?
sure, here it is: #
#
#
auth [success=1 default=ignore] pam_unix.so nullok_secure
auth requisite pam_deny.so
auth required pam_permit.so
auth optional pam_cap.so
Thanks for investigating this. Gregor
There's nothing unusual in your pam config that would explain such intermittent failures. What was displayed on the console when you were unable to unlock it?
I've revised the PAM error handling in order to prevent the excessive logging. However, this does not address the inability to unlock the computer in such a failure case.
Hi Bert,
There's nothing unusual in your pam config that would explain such intermittent failures. What was displayed on the console when you were unable to unlock it?
I don't know any more. It didn't seem strange to me. There are no relevant lines in my /var/log/messages -file...
I've revised the PAM error handling in order to prevent the excessive logging.
Thanks a bundle. This helps with my main problem.
However, this does not address the inability to unlock the computer in such a failure case.
It was'nt in all cases, that I was unable to unlock, these cases were some kind of freeze. OTOH there was one account of exessive logging while I was working at the computer so it was surely unlocked.
Thanks again, two minutes ago I pulled your latest commits and built physlock.
Ciao; Gregor
Closing it as there‘s nothing more for me to do. Feel free to reopen in case you have new information regarding the PAM errors.
Hi, it's me again. Thanks for physlock and the fix "in order to prevent the excessive logging".
This worked till Oct 14th. Now auth.log weighs 19 Gigabytes:
$ ls -l auth.log
-rw-r----- 1 root adm 19623837273 Okt 18 20:23 auth.log
This is with $ physlock -v physlock v11-35-g15744f5
I checked that it is this version which is used in my scripts.
Thanks for looking into this. Have a nice day, Gregor
I have no idea what's causing this. Are the log messages the same messages you've posted in your initial report?
Hi Bert,
I have no idea what's causing this. Are the log messages the same messages you've posted in your initial report?
yes. It this is from auth.log:
Oct 14 20:18:01 len CRON[7979]: pam_unix(cron:session): session opened for user xxxx by (uid=0) Oct 14 20:18:07 len CRON[7979]: pam_unix(cron:session): session closed for user xxxx Oct 14 20:18:29 len sudo: root : TTY=pts/4 ; PWD=/home/xxxx ; USER=root ; COMMAND=/sbin/nbd-client -l xxx Oct 14 20:18:29 len sudo: pam_unix(sudo:session): session opened for user root by xxxx(uid=0) Oct 14 20:18:29 len sudo: pam_unix(sudo:session): session closed for user root Oct 14 20:18:32 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] Oct 14 20:18:32 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] Oct 14 20:18:32 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] Oct 14 20:18:32 len physlock: pam_unix(physlock:auth): auth could not identify password for [root] Oct 14 20:18:32 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] Oct 14 20:18:32 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] ...
this is from syslog:
Oct 14 20:18:10 len spamd[1035]: spamd: handled cleanup of child pid [8578] due to SIGCHLD: interrupted, signal 2 (0002) Oct 14 20:18:29 len kernel: [ 7253.752869] [UFW BLOCK] IN=enp0s25 OUT= MAC=01:00:5e:00:00:01:c8:0c:c8:36:9c:d8:08:00 SRC=192.168.1.10 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=2 Oct 14 20:18:32 len systemd[1]: Mounting /mnt/xxx-backup... Oct 14 20:18:32 len systemd[1]: Starting Cryptography Setup for xxx-backup... Oct 14 20:18:32 len systemd[1]: mnt-xxx\x2dbackup.mount: Mount process exited, code=exited status=32 Oct 14 20:18:32 len systemd[1]: Failed to mount /mnt/xxx-backup. Oct 14 20:18:32 len systemd[1]: mnt-xxx\x2dbackup.mount: Unit entered failed state. Oct 14 20:18:32 len systemd-cryptsetup[8760]: Set cipher aes, mode xts-plain64, key size 256 bits for device /dev/nbd0. Oct 14 20:18:32 len kernel: [ 7256.591788] EXT4-fs (dm-4): Unrecognized mount option "relatime" or missing value Oct 14 20:18:32 len kernel: [ 7256.592935] EXT4-fs (dm-4): failed to parse options in superblock: relatime Oct 14 20:18:32 len kernel: [ 7256.706789] EXT4-fs (dm-4): mounted filesystem with journalled data mode. Opts: relatime; user_xattr,errors=remount-ro,barrier=1 Oct 14 20:18:32 len systemd-cryptsetup[8760]: Failed to activate with key file '/etc/backup3.key': Device or resource busy Oct 14 20:18:32 len systemd[1]: Started Dispatch Password Requests to Console. Oct 14 20:18:32 len systemd[1]: Starting Forward Password Requests to Wall... Oct 14 20:18:32 len systemd[1]: Stopped Dispatch Password Requests to Console Directory Watch. Oct 14 20:18:32 len systemd[1]: Stopping Dispatch Password Requests to Console... Oct 14 20:18:32 len systemd[1]: Stopped Dispatch Password Requests to Console. Oct 14 20:18:32 len systemctl[8779]: Failed to stop systemd-ask-password-plymouth.path: Unit systemd-ask-password-plymouth.path not loaded. Oct 14 20:18:32 len systemctl[8779]: Failed to stop systemd-ask-password-plymouth.service: Unit systemd-ask-password-plymouth.service not loaded. Oct 14 20:18:32 len systemd[1]: Started Forward Password Requests to Wall. Oct 14 20:18:33 len systemd[1]: Stopping Regular background program processing daemon... Oct 14 20:18:33 len systemd[1]: Stopped Regular background program processing daemon. Oct 14 20:18:33 len systemd[1]: anacron.timer: Adding 1min 25.694112s random time. Oct 14 20:18:33 len systemd[1]: Stopping Deferred execution scheduler... Oct 14 20:18:33 len systemd[1]: Stopped Deferred execution scheduler. Oct 14 20:18:33 len systemd[1]: Stopped Trigger anacron every hour.
what happens here is that I start my backup script. This shuts down the computer at 21:06:29 which is also the time stamp of the last thousands of
Oct 14 21:09:29 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] Oct 14 21:09:29 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] Oct 14 21:09:29 len physlock: pam_unix(physlock:auth): auth could not identify password for [xxxx] Oct 14 21:09:29 len physlock: pam_unix(physlock:auth): auth could not identify password for [root]
Next time this happens is Oct 16th and the problematic lines again start with the time of mounting a network block device to /mnt/xxx-backup and end the the shutdown or suspend of the machine. And the next occurrence again happens when said nbd is mounted for backup purposes.
Ciao; Gregor -- -... --- .-. . -.. ..--.. ...-.-
To sum up, this is what is happening: Computer is idle (not suspended) and locked with physlock; its password prompt is visible. Cron starts the backup process. Part of this process is mounting an encrypted block device. At this time physlock enters the auth-fail-loop.
Can you give me more detail on the backup process/script? Are you prompted for an encryption password?
Hi Bert,
thanks for your patience and interest:
To sum up, this is what is happening: Computer is idle (not suspended) and locked with physlock; its password prompt is visible. Cron starts the backup process.
Mostly, but not exactly. This is a laptop, I decide when to backup. Therefore I plug in an external usb hard drive and may plug in an ethernet cable, then I start the backup manually on the command line. I have to type in the root passwd. If I see everything is OK, I may go away or first enter a physlock command. Even if I go away, physlock will lock the screen after some time, due to a systemd timer (see below).
Actually the same command line starts the backup twice in a row. The backup script decides whether to backup to the network block device or to the usb had drive. The heuristic will favour the usb hard drive the first time, the network block device the second.
This was what happened at Oct 14th. When the second backup run and choose the network block device the screen must already have been locked.
The backup script further stops fetchmail services and cron, anacron and at via invoke-rc.d anacron stop and then does a systemctl stop on anacron.timer, anacron.service and cron.service this is to prevent heavy writing on the ssd which is to be backed up.
Part of this process is mounting an encrypted block device. At this time physlock enters the auth-fail-loop.
so: yes
Can you give me more detail on the backup process/script? Are you prompted for an encryption password?
No, since the password is on the ssd of the laptop. Actually I saw systemd-ask-password once or twice on the terminal after a backup, but do not know why.
I see nothing special there. If you want, I can send you the whole ugly script in private.
More interesting are, I assume, the scripts I use in order to lock the screen:
In ~/.xsession I have:
xss-lock -- systemctl --user start screenlock.service &
xss lock uses screenlock.service in order to lock the screen after a certain amount of inactivity:
~/.config/systemd/user/screenlock.service [Unit] Description=Lock all VTs using physlock Before=sleep.target
[Service] Type=forking ExecStart=/usr/local/bin/physlock -s -d
[Install] WantedBy=suspend.target hybrid-sleep.target
Since xss-lock might die I monitor it and eventually lock the screen with a systemd timer and service:
~/.config/systemd/user/VT-lock.timer [Unit] Description=Lock all VTs using physlock when either already locked nor xss-lock running
[Timer] Unit=VT-lock.service OnBootSec=90 OnUnitActiveSec=60
~/.config/systemd/user/VT-lock.service [Unit] Description=Lock all VTs using physlock when neither already locked nor xss-lock running
[Service] Type=simple ExecStart=/bin/sh -c "set -x ; ps fax | grep '[x]ss-lock -- systemctl --user start screenlock.service' || ps fax | grep '[/]usr/bin/physlock -s -d' || systemctl --user start screenlock.service"
And actually I see a mistake there: Since I use physlock built from sources the grep should read
grep '[/]usr/local/bin/physlock -s -d' ^^^^^^ or even better
egrep '[/]usr/?[^/]*/bin/physlock -s -d'
So in the rare cases that xss-lock is gone (say because the xserver crashed) physlock would be started over and over again but only every minute which is unfortunate but would not write Gigabytes to the file system.
I'll correct the path now but otherwise I am at a loss here.
Ciao; Gregor
I too am at a loss here. Neither do I know whats causing the error, nor do I know how to fix the excessive logging because the error code physlock gets is the same it gets after a wrong password has been entered.
Hi Bert,
I too am at a loss here. Neither do I know whats causing the error, nor do I know how to fix the excessive logging because the error code physlock gets is the same it gets after a wrong password has been entered.
I understand it's neigh to impossible to debug this without having access to my laptop.
I also understand that mine is a singular problem.
But we know that authentication via physlock needs a human to type in a password and therefore cannot fail several times a second. Would it be possible to output second, third etc log messages within the same second only as a aggregate? Something like:
Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: further 376584 times this second: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: further 376584 times this second: Authentication failure Jul 21 17:01:16 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:16 len physlock[30098]: Authentication failure Jul 21 17:01:16 len physlock[30098]: further 83646 times this second: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:16 len physlock[30098]: further 83646 times this second: Authentication failure ....
Or even: Jul 21 17:01:17 len physlock[30098]: further 829429 Authentication failures this second
???
It didn't happen again since my last report, but my main issue the danger of my ssd being filled and the system therefore failing.
Ciao; Gregor -- -... --- .-. . -.. ..--.. ...-.-
physlock itself generates no log messages. All the log messages you're seeing are generated in libpam. Also, the whole reason of using PAM is to make non-password authentication possible.
Get some errors looks like this.
Version: physlock 11-35-g15744f5
If i lock screen via sudo physlock -m
, after some time gets messages in journalctl -b | grep physlock
:
Feb 05 14:08:45 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:08:45 UbuSSD physlock[25144]: pam_unix(physlock:auth): auth could not identify password for [root]
Feb 05 14:18:46 UbuSSD sudo[2482]: drdeimos : TTY=unknown ; PWD=/home/drdeimos ; USER=root ; COMMAND=/usr/bin/physlock -m
Feb 05 14:18:46 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:18:46 UbuSSD physlock[25144]: pam_unix(physlock:auth): auth could not identify password for [drdeimos]
Feb 05 14:28:47 UbuSSD sudo[4649]: drdeimos : TTY=unknown ; PWD=/home/drdeimos ; USER=root ; COMMAND=/usr/bin/physlock -m
Feb 05 14:28:47 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:28:47 UbuSSD physlock[25144]: pam_unix(physlock:auth): auth could not identify password for [drdeimos]
Feb 05 14:38:48 UbuSSD sudo[6620]: drdeimos : TTY=unknown ; PWD=/home/drdeimos ; USER=root ; COMMAND=/usr/bin/physlock -m
Feb 05 14:38:48 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:38:48 UbuSSD physlock[25144]: pam_unix(physlock:auth): auth could not identify password for [drdeimos]
Feb 05 14:48:49 UbuSSD sudo[8596]: drdeimos : TTY=unknown ; PWD=/home/drdeimos ; USER=root ; COMMAND=/usr/bin/physlock -m
Feb 05 14:48:49 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:48:49 UbuSSD physlock[25144]: pam_unix(physlock:auth): auth could not identify password for [root]
Feb 05 14:58:50 UbuSSD sudo[10486]: drdeimos : TTY=unknown ; PWD=/home/drdeimos ; USER=root ; COMMAND=/usr/bin/physlock -m
Feb 05 14:58:50 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:58:50 UbuSSD physlock[25144]: pam_unix(physlock:auth): auth could not identify password for [drdeimos]
and on screen it's look like someone try to login multiple times with wrong password. Maybe this is some timeout while physlock communicate with pam.d? Look:
Feb 05 11:56:25 UbuSSD physlock[30558]: pam_unix(physlock:auth): conversation failed
Feb 05 13:26:34 UbuSSD physlock[22695]: pam_unix(physlock:auth): conversation failed
Feb 05 13:38:42 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 13:48:43 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 13:58:44 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:08:45 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:18:46 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:28:47 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:38:48 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:48:49 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
Feb 05 14:58:50 UbuSSD physlock[25144]: pam_unix(physlock:auth): conversation failed
last 9 times (in this time laptop was locked and i don't interact with him). The same exactly every 10 minutes.
First of all, many thanks for physlock!
Today my laptop woke via rtcwake while I was in the room. Only one hour later I realized heavy fan noise. I tried to unlock but the laptop did not respond to key presses. I shut it down using the power button. Now I realize there are millions of lines in auth.log stating
auth could not identify password for [user]
The first time I see this in auth.log is July 21st:
Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [root] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [user] Jul 21 17:01:15 len physlock[30098]: Authentication failure Jul 21 17:01:15 len physlock[30098]: pam_unix(physlock:auth): auth could not identify password for [root] Jul 21 17:01:15 len physlock[30098]: Authentication failure
$ sudo grep -c 'auth could not identify password for [user]' /var/log/auth.log /var/log/auth.log.1 /var/log/auth.log: 27314371 /var/log/auth.log.1:83549249
There are for instance 8838 lines with date stamp Aug 5 08:45:06 within one second.
From browsing these auth.log files I found among millions of such lines at july 30 a single line where a sudo command is logged which I typed in at the time, presumably (or a remote hacker issuing a command I often issue as sudo). So it looks as if these lines are also generated while I was working on a unlocked system.
This is debian stable (stretch) and the last kernel upgrade to vmlinuz-4.9.0-7-amd64 was at Jul 16 20:13 accordingt to initrd time stamp.
$ physlock -v physlock git-20170918 - lock all consoles
I'll upgrade physlock to git master.