i3 / i3lock

improved screen locker
https://i3wm.org/i3lock
BSD 3-Clause "New" or "Revised" License
921 stars 404 forks source link

i3lock hangs / crashes if a successful authentication is immediately followed by another authentication attempt (race condition) #254

Closed joanbm closed 4 years ago

joanbm commented 4 years ago

I'm submitting a…

[X ] Bug [ ] Feature Request [ ] Other (Please describe in detail)

Current Behavior

I found that, if one successfully authenticates (i.e. types the correct password and then presses ENTER), but then immediately attempts a further authentication (i.e. types a correct or incorrect password and then presses ENTER), it's possible that i3lock hangs (in my systems, the behaviour is that i3lock stays locked and it's impossible to unlock it even when using the correct password). Analysis with valgrind shows that invalid memory accesses cause this behaviour, so other effects like a crash may also be possible.

In the previous paragraph, immediately means "before the previous authentication event was completely processed by i3lock". So, this is a race condition that usually happens under a very precise, sub-second timeframe. Later, I provide a way to easily reproduce this race condition and a more detailed analysis.

Expected Behavior

Since there was already a successful authentication event, the locker should gracefully shut down and the later authentication events should be ignored.

Reproduction Instructions

Since this is a race condition, it may be difficult to reproduce this problem, let alone consistently. However, I found a way to reproduce this problem with 100% consistency.

For reproducing the problem, I recommend changing your password to something short (but not empty). For example, change your password to just a single "1" character.

Next, introduce a sleep statement in the i3lock.c main function, just before starting the event loop, and recompile the code:

    ev_prepare_init(xcb_prepare, xcb_prepare_cb);
    ev_prepare_start(main_loop, xcb_prepare);

    /* Invoke the event callback once to catch all the events which were
     * received up until now. ev will only pick up new events (when the X11
     * file descriptor becomes readable). */
+    sleep(3);
    ev_invoke(main_loop, xcb_check, 0);
    ev_loop(main_loop, 0);

To reproduce the problem, launch the just-compiled i3lock binary and start alternately spamming your password and ENTER, e.g. alternately press "1" and ENTER. You should be able to shortly reproduce the problem. Since it may not have the same behavior as in my system, I recommend also attaching valgrind so you can at least observe the invalid memory accesses.

(Make sure you have access to an alternate TTY so you can kill i3lock in case the behavior is a hang)

Analysis of the valgrind output shows many invalid memory access like the following:

==4641== Invalid read of size 4
==4641==    at 0x4E4F5BC: pam_authenticate (in /lib/libpam.so.0.84.2)
==4641==    by 0x10D695: input_done (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10D51E: finish_input (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10DBF6: handle_key_press (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10EBFE: xcb_check_cb (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10FB65: main (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==  Address 0x48bbc78 is 8 bytes inside a block of size 368 free'd
==4641==    at 0x489F74F: free (vg_replace_malloc.c:540)
==4641==    by 0x4E502D3: pam_end (in /lib/libpam.so.0.84.2)
==4641==    by 0x10D6DD: input_done (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10D51E: finish_input (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10DBF6: handle_key_press (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10EBFE: xcb_check_cb (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==    by 0x10FB65: main (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)
==4641==  Block was alloc'd at
==4641==    at 0x48A06F2: calloc (vg_replace_malloc.c:762)
==4641==    by 0x4E53474: pam_start (in /lib/libpam.so.0.84.2)
==4641==    by 0x10F340: main (in /home/zealcharm/i3lock-2.12/x86_64-pc-linux-musl/i3lock)

Maybe this is also related to issue #252 (some of the invalid accesses are reported in pam_set_item like in the Fedora crash report). Full log on Arch Linux, with PAM debug symbols valgrind_log.txt.zip

(If you're wondering how I originally ran into this problem, given that it's caused by such a precise race condition, it's because I was using a patched version of i3lock that allows me to authenticate with a USB key instead of with a password. So, to log in, I was just spamming/holding ENTER after inserting the USB key. This was enough for me to reproduce the problem without any synthetic sleep statement).

Technical explanation

The problem is the way the input_done function is implemented:

static void input_done(void) {
    [...]

    if (pam_authenticate(pam_handle, 0) == PAM_SUCCESS) {
        DEBUG("successfully authenticated\n");
        clear_password_memory();

        /* PAM credentials should be refreshed, this will for example update any kerberos tickets.
         * Related to credentials pam_end() needs to be called to cleanup any temporary
         * credentials like kerberos /tmp/krb5cc_pam_* files which may of been left behind if the
         * refresh of the credentials failed. */
        pam_setcred(pam_handle, PAM_REFRESH_CRED);
        pam_end(pam_handle, PAM_SUCCESS);

        ev_break(EV_DEFAULT, EVBREAK_ALL);
        return;
    }

    [...]
}

On a successful authentication event, pam_authenticate returns PAM_SUCCESS, and then pam_end is called, releasing the resources associated with the PAM handle. Then, it calls ev_break to exit the main loop and thus exit the locker application and unlock the screen.

However, calling ev_break does not immediately exit the main loop, but it only does this after all pending events have been processed. From man ev:

       ev_break (loop, how)
           Can be used to make a call to "ev_run" return early (but only after it has
           processed all outstanding events).

Therefore, by quickly causing another authentication event after a successful authentication event, it's possible to reenter input_done after the PAM handle has been released. This causes the call to pam_authenticate to use an already-released PAM handle which causes the invalid memory reads and incorrect behavior reported.

Running a git blame shows that i3lock formerly used to call exit(0) immediately after pam_end, so this wasn't a problem before. However, this was changed in commit 5b4d45a8aff1dacaad364ae6df70c3a6a5067701.

To fix the problem, it should be enough to move the call to pam_end to the cleanup code in main after the main loop exits, instead of immediately releasing it after pam_authenticate (I am not sure if the call to pam_setcred should also be moved or not, since I'm not familiar with the problem it's trying to solve).

Environment

Output of i3lock --version:

i3lock: version 2.12 © 2010 Michael Stapelberg

Problem reproduced 2020-01-02 on Arch Linux, Alpine Linux Edge, on both i3lock 2.12 and git master

joanbm commented 4 years ago

For easier reproduction without messing too much with your system you can use this Dockerfile (run with xhost local:root && sudo docker build . -t i3lock_docker && sudo docker run -it --rm --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" i3lock_docker):

FROM alpine:edge

# Install build tools + i3lock depends and makedepends
RUN apk add --update \
    make gcc musl-dev valgrind \
    xkeyboard-config \
    libev-dev cairo-dev linux-pam-dev libxkbcommon-dev \
    xcb-util-image-dev xcb-util-xrm-dev \
  && rm -rf /var/cache/apk/*

# Download + build i3lock (with easy race condition trigger patch)
RUN wget https://i3wm.org/i3lock/i3lock-2.12.tar.bz2
RUN echo '84f1558368381bcad9a64f41ab6134a6614dea453d1ee5ecfe886185b9e1baebeeca446c4635158deb8dae5b25c09d47d3990239d76c44e5325ca5bfaad9b2ad  i3lock-2.12.tar.bz2' |sha512sum -c - 

RUN tar xf i3lock-2.12.tar.bz2
WORKDIR i3lock-2.12
RUN sed -i -e 's/ev_invoke/sleep(3); ev_invoke/g' i3lock.c
RUN ./configure && make -j

# Attempt to reproduce the problem
RUN echo "root:1" | chpasswd
RUN echo -e '#!/bin/sh \n\
echo i3lock will launch shortly. To reproduce the problem, \n\
echo alternately press 1 and ENTER after the launcher starts \n\
sleep 5 \n\
valgrind x86_64-pc-linux-musl/i3lock &disown \n\
sleep 5 \n\
pkill -9 valgrind' > launchscript.sh
ENTRYPOINT sh launchscript.sh
stapelberg commented 4 years ago

Thanks for this incredibly well-researched report. I’ll try reproducing and fixing it once I get a chance

stapelberg commented 4 years ago

Now fixed in master. Let me know if there’s anything left to fix, and thanks again for the great report :)

joanbm commented 4 years ago

@stapelberg Thanks and my pleasure 😉! Fix is working correctly on my side.