wolfcw / libfaketime

libfaketime modifies the system time for a single application
https://github.com/wolfcw/libfaketime
GNU General Public License v2.0
fstat(), stat(), lstat() overrides failing with newer Debian/Ubuntu #362

Closed sirainen closed 2 years ago

sirainen commented 2 years ago

I'm not entirely sure of the reason, but faketime's fstat() override no longer works on my recent Debian. It also fails on a recent Ubuntu on another machine. In strace, the non-working system shows:

newfstatat(3, "", {st_mode=S_IFREG|0600, st_size=0, ...}, AT_EMPTY_PATH) = 0

I tried patching faketime with a new newfstatat() call, but it doesn't seem to be getting called. Maybe it's called something else like __newfstatat()? I can't figure out where these changed names come from.

wolfcw commented 2 years ago

Thanks for pointing this out. Based on the man page for newfstatat I'd assume that it would have to be handled similarly to fstatat64, which is already intercepted by libfaketime. I'm not sure why your interceptor function is not called at all; the man page hints that maybe only glibc calls newfstatat, and glibc is one layer below libfaketime. So it would be interesting to know which system function the no-longer-working applications actually call. I'd be surprised if they called newfstatat directly, since that name seems to be architecture-specific. Also, is there any way you could share your patch, so we can experiment based on it?

sirainen commented 2 years ago

I'm now wondering if I'm understanding this all wrong. faketime should be hooking into glibc functions, not syscalls, right? And the strace output's newfstatat() is a syscall, not glibc function.
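The distinction drawn here (library symbols can be hooked, syscalls cannot) can be sketched as a minimal LD_PRELOAD-style interposer. This is not libfaketime's code: the counter `intercepted_fstat_calls` is made up for illustration, the sketch assumes a 64-bit glibc system, and it only forwards the call instead of faking any timestamps.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT in <dlfcn.h> */
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

#ifndef RTLD_NEXT              /* fallback if _GNU_SOURCE took effect too late */
#define RTLD_NEXT ((void *)-1L) /* glibc's value; an assumption elsewhere */
#endif

/* Hypothetical counter, only here to make the interception observable. */
int intercepted_fstat_calls = 0;

/* Same name and signature as the libc function. When this object comes
 * first in symbol lookup order (for example via LD_PRELOAD), the dynamic
 * linker binds the application's fstat() calls to this definition. */
int fstat(int fd, struct stat *st)
{
    static int (*real_fstat)(int, struct stat *);

    intercepted_fstat_calls++;
    if (real_fstat == NULL) {
        /* Find the next definition of "fstat" in lookup order: libc's. */
        *(void **)(&real_fstat) = dlsym(RTLD_NEXT, "fstat");
    }
    if (real_fstat == NULL) {
        /* libc exports no symbol named "fstat" (possible on older glibc). */
        errno = ENOSYS;
        return -1;
    }
    /* libc then issues the real syscall, which is what strace reports. */
    return real_fstat(fd, st);
}
```

Built as a shared object (for instance `gcc -shared -fPIC hook.c -o hook.so -ldl`, file names being placeholders) and loaded with LD_PRELOAD, a wrapper like this only runs if the application really resolves a symbol called fstat. The newfstatat() line in strace is emitted one layer further down, inside libc, and never passes through the dynamic linker.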

If I set a gdb breakpoint to fstat(), it ends up in __GI___fstat64(). If I hook into either that or just __fstat64() then it gets called, but faketime fails at startup with:

libfaketime: In ft_shm_init(), sem_open failed and recreation attempts failed: Bad address
libfaketime: sem_name was /faketime_sem_1856022, created locally: false

sirainen commented 2 years ago

test.patch.txt

This is what I have so far. And .. now I see the copy-paste mistake in there. __fstat64() doesn't have a ver parameter. It no longer crashes, and it is being called, but it's still not returning the right ctime value. But maybe I can now debug this further..

sirainen commented 2 years ago

I was thinking of this in way too complex ways. ltrace shows the correct calls, and they're simply fstat(), stat() and lstat() nowadays. Not sure if it's due to glibc 2.33 or because Debian builds something differently or what.
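To test the glibc 2.33 guess on a given machine, the run-time glibc version can be queried directly. The sketch below assumes glibc, since `gnu_get_libc_version()` is glibc-specific; the helper name `glibc_at_least` is hypothetical.

```c
#include <gnu/libc-version.h>
#include <stdio.h>

/* Returns 1 if the glibc in use at run time is at least major.minor,
 * 0 otherwise (including when the version string cannot be parsed). */
int glibc_at_least(int major, int minor)
{
    int have_major = 0, have_minor = 0;

    /* gnu_get_libc_version() returns a string such as "2.36". */
    if (sscanf(gnu_get_libc_version(), "%d.%d", &have_major, &have_minor) != 2)
        return 0;
    return have_major > major ||
           (have_major == major && have_minor >= minor);
}
```

A call such as `glibc_at_least(2, 33)` on both the working and the non-working system would show whether the behaviour change lines up with that release or with something distribution-specific.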

noone-silent commented 2 years ago

I can reproduce it using a docker container

Build the container with

docker build -t faketime-container -f Dockerfile.txt .

Then run the following:

docker run --rm -it -v $(pwd):/mnt/ -e LD_PRELOAD=/faketime.so faketime-container bash

Inside

strace -o /mnt/strace.log su -c "/testscript.sh" www-data

Dockerfile.txt strace.log

Maybe it helps solving the problem.

sirainen commented 2 years ago

I have a pull request https://github.com/wolfcw/libfaketime/pull/363 that fixes this. I can try cleaning it up further if necessary to get it merged.

noone-silent commented 2 years ago

> I have a pull request #363 that fixes this. I can try cleaning it up further if necessary to get it merged.

I changed the Dockerfile to load the libfaketime.c from your commit, the error still exists.

Dockerfile.txt

sirainen commented 2 years ago

Your strace.log is showing fstat() call, so I'd think my patch would work. And it definitely did work for me. I don't see anything clearly wrong in your Dockerfile either. I think at this point I have to say at least that it's too much effort for me to continue debugging..

dkg commented 2 years ago

@sirainen can you give an example of a specific command that doesn't behave properly under faketime on newer Debian/Ubuntu? Having a clear test case would help me to evaluate #363

sirainen commented 2 years ago

I was using this C program:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <time.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
    unlink("foo");
    int fd = creat("foo", 0600);
    for (int i = 0; i < 20; i++)
        fd = dup(fd); // unnecessary, done just to make sure I'm looking at the right fd in strace/ltrace
    write(fd, "foo", 3);
    struct timeval tv; gettimeofday(&tv, NULL);
    struct stat st; lstat("foo", &st);
    printf("lstat a: %d\n", (int)(st.st_atime - time(NULL)));
    printf("lstat m: %d\n", (int)(st.st_mtime - time(NULL)));
    printf("lstat c: %d\n", (int)(st.st_ctime - time(NULL)));
    stat("foo", &st);
    printf("stat a: %d\n", (int)(st.st_atime - time(NULL)));
    printf("stat m: %d\n", (int)(st.st_mtime - time(NULL)));
    printf("stat c: %d\n", (int)(st.st_ctime - time(NULL)));
    fstat(fd, &st);
    printf("fstat m: %d\n", (int)(st.st_mtime - time(NULL)));
    printf("fstat c: %d\n", (int)(st.st_ctime - time(NULL)));
    gettimeofday(&tv, NULL);
    return 0;
}

When using the Debian-packaged faketime I get:

% /usr/bin/faketime '1 days ago' ~/test
lstat a: 86400
lstat m: 86400
lstat c: 86400
stat a: 86400
stat m: 86400
stat c: 86400
fstat m: 86400
fstat c: 86400

When using my self-built faketime with the patch I get:

% /usr/local/bin/faketime '1 days ago' ~/test
lstat a: 0
lstat m: 0
lstat c: 0
stat a: 0
stat m: 0
stat c: 0
fstat m: 0
fstat c: 0

wolfcw commented 2 years ago

Still pondering here.

I wonder whether we need full implementations of those basically same functions again, or whether aliases or one calling the other would do. Similarly, do we need both, can we have both variants, or are they mutually exclusive on some systems? If so, should we use a compile-time switch to choose, and what should the default be?

Also, the naming confusion (number of underscores in our real_functionname vs. number of underscores in the original _functionname) should probably be cleaned up at this opportunity, though this is some older debt.

sirainen commented 2 years ago

I did some larger cleanups, which simplify adding more stat-like calls. What do you think of those?

I don't think there's a problem with always hooking into all of them. The ones that get used will be used, and the others simply won't be called. The new Debian actually seems to use two sets of those calls: the glibc (or some other library?) initialization used one of the older calls, while the actual code then uses the newer fstat/stat/lstat() calls. Similarly, statically linked binaries could be using different kinds of stat calls than glibc otherwise would.
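The idea of keeping every stat-like wrapper thin and sharing the post-processing can be sketched roughly as follows. This is an illustration only, not the code from the pull request: `shift_stat_times`, `path_stat_fn` and `faked_path_stat` are hypothetical names, and a fixed offset in seconds stands in for libfaketime's real time calculation.

```c
#include <sys/stat.h>
#include <time.h>

/* Hypothetical shared helper: move all three timestamps by `offset`
 * seconds (negative for a fake time in the past). */
void shift_stat_times(struct stat *st, time_t offset)
{
    st->st_atime += offset;
    st->st_mtime += offset;
    st->st_ctime += offset;
}

/* Signature shared by the path-based variants stat() and lstat(). */
typedef int (*path_stat_fn)(const char *, struct stat *);

/* Call the real function, then post-process the result on success.
 * Each intercepted variant reduces to a one-line call of this. */
int faked_path_stat(path_stat_fn real_fn, const char *path,
                    struct stat *st, time_t offset)
{
    int rc = real_fn(path, st);
    if (rc == 0)
        shift_stat_times(st, offset);
    return rc;
}
```

With an offset of -86400 this matches the test program's expectation for '1 days ago': the file's timestamps move back by the same amount as time(), so the printed differences come out as 0 rather than 86400.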

wolfcw commented 2 years ago

Thanks, good thinking and I like the new version. Planning on merging unless @dkg has objections based on insights into the future Debian / glibc combination.

A long time ago, the FAKE_INTERNAL_CALLS flag was added to make interception of some "internal" functions (those prefixed with __) optional, given some looping / double-fake issues back then. Unless we run into something similar on some platforms here, intercepting them all should suffice for backwards compatibility.

noone-silent commented 2 years ago

> I can reproduce it using a docker container
>
> Build the container with
>
> docker build -t faketime-container -f Dockerfile.txt .
>
> Then run the following:
>
> docker run --rm -it -v $(pwd):/mnt/ -e LD_PRELOAD=/faketime.so faketime-container bash
>
> Inside
>
> strace -o /mnt/strace.log su -c "/testscript.sh" www-data
>
> Dockerfile.txt strace.log
>
> Maybe it helps solving the problem.

I still got the same error running my test here with the newest release

wolfcw commented 2 years ago

Thanks for reminding about this.

Admittedly, I'm having a bit of a hard time following here. What error or unexpected output do you get?

I guess I'm failing to spot where you set a specific FAKETIME.

Also, su is a suidroot binary, and therefore LD_PRELOADing libfaketime will probably not work at all.

Can you try with LD_PRELOADing libfaketime and setting the appropriate environment variables within your testscript.sh, and then, e.g., capture the output by redirecting it to a text file?

noone-silent commented 2 years ago

I tried su -m, but that didn't work. Passing everything down into su itself, like su -c 'export LD_PRELOAD=/faketime.so; cmd', works.

wolfcw commented 2 years ago

OK, but that sounds rather like facing a by-design limitation of LD_PRELOAD concerning suidroot (or statically linked) binaries, not like a problem with specific intercepted function calls.
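Whether a given process is subject to this limitation can be checked from inside it. The sketch below assumes Linux with glibc 2.16 or later, where `getauxval(AT_SECURE)` reports the kernel's secure-execution flag; the function name `in_secure_execution_mode` is made up for illustration.

```c
#include <sys/auxv.h>

/* Returns 1 if the kernel flagged this process for secure execution
 * (set for set-user-ID or set-group-ID programs, among other cases),
 * 0 otherwise. */
int in_secure_execution_mode(void)
{
    return getauxval(AT_SECURE) != 0;
}
```

Per the ld.so(8) manual page, in secure-execution mode LD_PRELOAD entries containing slashes are ignored, so a path such as /faketime.so would not be loaded into su regardless of which stat variants libfaketime intercepts.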

noone-silent commented 2 years ago

Yes. I got the same error as the others (ft_shm_init):

> I'm now wondering if I'm understanding this all wrong. faketime should be hooking into glibc functions, not syscalls, right? And the strace output's newfstatat() is a syscall, not glibc function.
>
> If I set a gdb breakpoint to fstat(), it ends up in __GI___fstat64(). If I hook into either that or just __fstat64() then it gets called, but faketime fails at startup with:
>
> libfaketime: In ft_shm_init(), sem_open failed and recreation attempts failed: Bad address
> libfaketime: sem_name was /faketime_sem_1856022, created locally: false

I didn't know you need to pass everything down manually if I'm using su

wolfcw commented 2 years ago

Well, LD_PRELOAD settings are ignored for suidroot binaries, such as su, for security reasons. No way around that.

Any chance you could try outside of your Docker containers, or check their permissions? libfaketime's ft_shm_init() may be failing due to restrictions on shared memory on your system ("Bad address"). You should also delete stale semaphores and shared memory segments (the "created locally: false" indicates either use of the faketime wrapper or stale files lying around). As long as libfaketime cannot initialize properly, possibly non-working interceptions of calls like fstat64() cannot really be diagnosed.
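A rough way to look for stale semaphores is to scan the directory that backs them. The sketch assumes Linux with glibc, where a named POSIX semaphore /NAME shows up as the file /dev/shm/sem.NAME (see sem_overview(7)); the prefix used in the usage note is inferred from the sem_name in the error output above, and `count_entries_with_prefix` is a hypothetical helper.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Counts the entries in `dir` whose names start with `prefix`.
 * Returns -1 if the directory cannot be opened. */
int count_entries_with_prefix(const char *dir, const char *prefix)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    size_t len = strlen(prefix);
    int count = 0;
    struct dirent *entry;

    while ((entry = readdir(d)) != NULL) {
        if (strncmp(entry->d_name, prefix, len) == 0)
            count++;
    }
    closedir(d);
    return count;
}
```

For example, `count_entries_with_prefix("/dev/shm", "sem.faketime_sem_")` returning a non-zero count while no faketime process is running would point at leftovers. A result of -1 suggests /dev/shm itself is missing or unreadable, which is worth ruling out inside a container.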

wolfcw commented 2 years ago

I consider the original problem to be solved with the 0.9.10 release. Feel free to re-open or open another issue if the sem_open issue persists.