termux / proot

A chroot-like implementation using ptrace.
https://wiki.termux.com/wiki/PRoot

Strange `ls` behavior after Kali upgrade #154

Open corbinlc opened 3 years ago

corbinlc commented 3 years ago

@michalbednarski I am not sure yet whether this is a proot issue, but I currently think it is not. Regardless, I wanted your thoughts, because it seems like the type of problem you often get to the bottom of. In UserLAnd, which uses PRoot (you know this, but I am writing it for anyone else reading), Kali Linux currently works fine, though the installed image is a bit stale. If you run sudo apt upgrade to catch up to current, the ls command starts returning the normal list of files but also reports that each one does not exist. The image is based on kali-rolling, which pulls some pretty recent packages into its build, so I am a little worried that this might start showing up on other prooted images running on Android in the future. I will post some related logs on this issue for your review, but I am also wondering if you have any other thoughts on what could be going on.

corbinlc commented 3 years ago

Here is what the error looks like to the user:

Say you are in a directory that has two files, a.txt and b.txt, and you run ls:

me@localhost:~$ ls
ls: cannot access 'a.txt': No such file or directory
ls: cannot access 'b.txt': No such file or directory
a.txt
b.txt

So it says none of them exist but then prints them out anyway. Doing cat a.txt or other things that use the files works normally. Running ls on anything within a directory that is part of a binding returns the errors but doesn't print the names afterwards.

michalbednarski commented 3 years ago

Most probably this is related to the statx(2) syscall. I've confirmed that the old version doesn't use it while the new one does, although so far I haven't looked further, as the issue didn't reproduce on my phone.

(Some commits related to statx are referenced in #122)

corbinlc commented 3 years ago

I saw those commits yesterday when looking through how stale the UserLAnd proot was. I don't see statx being called, though, but I should catch up. Here is the interesting part of the proot log for the failing and passing cases (this is running ls on my home directory before and after apt upgrade).

Failing:

proot info: vpid 20: sysenter start: openat(0xffffffffffffff9c, 0x300003b730, 0x84800, 0x0, 0x300001ae82, 0x300003b732) = 0xffffffffffffff9c [0x7ff45c3eb0, 0]
proot info: vpid 20: translate("/home/corbin" + ".")
proot info: vpid 20:          -> "/data/user/0/tech.ula/files/1/home/corbin/."
proot info: vpid 20: sysenter end: openat(0xffffffffffffff9c, 0x7ff45c3e84, 0x84800, 0x0, 0x300001ae82, 0x300003b732) = 0xffffffffffffff9c [0x7ff45c3eb0, 0]
proot info: vpid 20: sysenter start: fstat(0x3, 0x7ff45c3e88, 0x7ff45c3e88, 0x3, 0x300001ae82, 0x300003b732) = 0x3 [0x7ff45c3e60, 0]
proot info: vpid 20: sysenter end: fstat(0x3, 0x7ff45c3e88, 0x7ff45c3e88, 0x3, 0x300001ae82, 0x300003b732) = 0x3 [0x7ff45c3e60, 0]
proot info: vpid 20: sysexit start: fstat(0x0, 0x7ff45c3e88, 0x7ff45c3e88, 0x3, 0x300001ae82, 0x300003b732) = 0x0 [0x7ff45c3e60, 0]
proot info: vpid 20: sysexit end: fstat(0x0, 0x7ff45c3e88, 0x7ff45c3e88, 0x3, 0x300001ae82, 0x300003b732) = 0x0 [0x7ff45c3e60, 0]
proot info: vpid 20: sysenter start: getdents64(0x3, 0x300003b780, 0x8000, 0x7fffffff, 0x4, 0x3) = 0x3 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysenter end: getdents64(0x3, 0x300003b780, 0x8000, 0x7fffffff, 0x4, 0x3) = 0x3 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysexit start: getdents64(0xd8, 0x300003b780, 0x8000, 0x7fffffff, 0x4, 0x3) = 0x1c0 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysexit end: getdents64(0xd8, 0x300003b780, 0x8000, 0x7fffffff, 0x4, 0x3) = 0x158 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysenter start: getdents64(0x3, 0x300003b780, 0x8000, 0x7fffffff, 0x300003b82e, 0x300004379b) = 0x3 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysenter end: getdents64(0x3, 0x300003b780, 0x8000, 0x7fffffff, 0x300003b82e, 0x300004379b) = 0x3 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysexit start: getdents64(0x0, 0x300003b780, 0x8000, 0x7fffffff, 0x300003b82e, 0x300004379b) = 0x0 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysexit end: getdents64(0x0, 0x300003b780, 0x8000, 0x7fffffff, 0x300003b82e, 0x300004379b) = 0x0 [0x7ff45c3ec0, 0]
proot info: vpid 20: sysenter start: fstat(0x1, 0x7ff45c1ce8, 0x7ff45c1ce8, 0x1, 0x7afc70f944, 0x8080808080800000) = 0x1 [0x7ff45c1cb0, 0]
proot info: vpid 20: sysenter end: fstat(0x1, 0x7ff45c1ce8, 0x7ff45c1ce8, 0x1, 0x7afc70f944, 0x8080808080800000) = 0x1 [0x7ff45c1cb0, 0]
proot info: vpid 20: sysexit start: fstat(0x0, 0x7ff45c1ce8, 0x7ff45c1ce8, 0x1, 0x7afc70f944, 0x8080808080800000) = 0x0 [0x7ff45c1cb0, 0]
proot info: vpid 20: sysexit end: fstat(0x0, 0x7ff45c1ce8, 0x7ff45c1ce8, 0x1, 0x7afc70f944, 0x8080808080800000) = 0x0 [0x7ff45c1cb0, 0]
proot info: vpid 20: exited with status 1

Passing:

proot info: vpid 20: sysenter start: openat(0xffffffffffffff9c, 0x300003a6f0, 0x84800, 0x0, 0x300001ab42, 0x300003a6f2) = 0xffffffffffffff9c [0x7ffaa6bc80, 0]
proot info: vpid 20: translate("/home/corbin" + ".")
proot info: vpid 20:          -> "/data/user/0/tech.ula/files/2/home/corbin/."
proot info: vpid 20: sysenter end: openat(0xffffffffffffff9c, 0x7ffaa6bc54, 0x84800, 0x0, 0x300001ab42, 0x300003a6f2) = 0xffffffffffffff9c [0x7ffaa6bc80, 0]
proot info: vpid 20: sysenter start: fstat(0x3, 0x7ffaa6bc58, 0x7ffaa6bc58, 0x3, 0x300001ab42, 0x300003a6f2) = 0x3 [0x7ffaa6bc30, 0]
proot info: vpid 20: sysenter end: fstat(0x3, 0x7ffaa6bc58, 0x7ffaa6bc58, 0x3, 0x300001ab42, 0x300003a6f2) = 0x3 [0x7ffaa6bc30, 0]
proot info: vpid 20: sysexit start: fstat(0x0, 0x7ffaa6bc58, 0x7ffaa6bc58, 0x3, 0x300001ab42, 0x300003a6f2) = 0x0 [0x7ffaa6bc30, 0]
proot info: vpid 20: sysexit end: fstat(0x0, 0x7ffaa6bc58, 0x7ffaa6bc58, 0x3, 0x300001ab42, 0x300003a6f2) = 0x0 [0x7ffaa6bc30, 0]
proot info: vpid 20: sysenter start: getdents64(0x3, 0x300003a740, 0x8000, 0x7fffffff, 0x4, 0x3) = 0x3 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysenter end: getdents64(0x3, 0x300003a740, 0x8000, 0x7fffffff, 0x4, 0x3) = 0x3 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysexit start: getdents64(0xd8, 0x300003a740, 0x8000, 0x7fffffff, 0x4, 0x3) = 0xd8 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysexit end: getdents64(0xd8, 0x300003a740, 0x8000, 0x7fffffff, 0x4, 0x3) = 0xd8 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysenter start: fstatat64(0xffffffffffffff9c, 0x7ffaa6b930, 0x3000035ee8, 0x100, 0x8, 0x0) = 0xffffffffffffff9c [0x7ffaa6b930, 0]
proot info: vpid 20: translate("/home/corbin" + "index.html")
proot info: vpid 20:          -> "/data/user/0/tech.ula/files/2/home/corbin/index.html"
proot info: vpid 20: sysenter end: fstatat64(0xffffffffffffff9c, 0x7ffaa6b8fb, 0x3000035ee8, 0x100, 0x8, 0x0) = 0xffffffffffffff9c [0x7ffaa6b8fb, 0]
proot info: vpid 20: sysexit start: fstatat64(0x0, 0x7ffaa6b8fb, 0x3000035ee8, 0x100, 0x8, 0x0) = 0x0 [0x7ffaa6b8fb, 0]
proot info: vpid 20: sysexit end: fstatat64(0x0, 0x7ffaa6b930, 0x3000035ee8, 0x100, 0x8, 0x0) = 0x0 [0x7ffaa6b930, 0]
proot info: vpid 20: sysenter start: getdents64(0x3, 0x300003a740, 0x8000, 0x7fffffff, 0x300003a7ee, 0x300004275b) = 0x3 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysenter end: getdents64(0x3, 0x300003a740, 0x8000, 0x7fffffff, 0x300003a7ee, 0x300004275b) = 0x3 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysexit start: getdents64(0x0, 0x300003a740, 0x8000, 0x7fffffff, 0x300003a7ee, 0x300004275b) = 0x0 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysexit end: getdents64(0x0, 0x300003a740, 0x8000, 0x7fffffff, 0x300003a7ee, 0x300004275b) = 0x0 [0x7ffaa6bc90, 0]
proot info: vpid 20: sysenter start: fstat(0x1, 0x7ffaa69728, 0x7ffaa69728, 0x1, 0x7a8062f944, 0xa) = 0x1 [0x7ffaa696f0, 0]
proot info: vpid 20: sysenter end: fstat(0x1, 0x7ffaa69728, 0x7ffaa69728, 0x1, 0x7a8062f944, 0xa) = 0x1 [0x7ffaa696f0, 0]
proot info: vpid 20: sysexit start: fstat(0x0, 0x7ffaa69728, 0x7ffaa69728, 0x1, 0x7a8062f944, 0xa) = 0x0 [0x7ffaa696f0, 0]
proot info: vpid 20: sysexit end: fstat(0x0, 0x7ffaa69728, 0x7ffaa69728, 0x1, 0x7a8062f944, 0xa) = 0x0 [0x7ffaa696f0, 0]
proot info: vpid 20: exited with status 0

corbinlc commented 3 years ago

I am wondering if they compiled coreutils with READDIR_LIES_ABOUT_MOUNTPOINT_D_INO. In that case ls will fail if the ino returned by readdir/getdents doesn't match the real ino. I will try catching up on proot commits; I need to re-merge anyway at some point. Your way of handling shmget and related calls is probably better than what I put in a while back. I will also check whether the ino matching does in fact matter.

corbinlc commented 3 years ago

That is not it. The inodes returned are the same as what you get from calling stat. We are running with the hidden_files extension, but the issue shows up whether or not we use that extension.

michalbednarski commented 3 years ago

Clearly something statx-related. I LD_PRELOAD'ed the following into ls:

#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <linux/stat.h>
#include <fcntl.h>
#include <dlfcn.h>
#include <errno.h>

static int(*real_statx)(int dirfd, const char *pathname, int flags, unsigned int mask, struct statx *statxbuf);
int statx(int dirfd, const char *pathname, int flags, unsigned int mask, struct statx *statxbuf) {
    int saved_errno;
    int ret;
    char buf[100];

    // Find real statx() from libc
    if (real_statx == NULL) {
        real_statx = dlsym(RTLD_NEXT, "statx");
    }

    // Note before call
    saved_errno = errno;
    snprintf(buf, sizeof(buf), "/aaa_before_errno=%d", saved_errno);
    open(buf, O_RDONLY);
    errno = saved_errno;

    // Call real function
    ret = real_statx(dirfd, pathname, flags, mask, statxbuf);

    // Note after call
    saved_errno = errno;
    snprintf(buf, sizeof(buf), "/aaa_after_ret=%d_errno=%d", ret, errno);
    open(buf, O_RDONLY);
    errno = saved_errno;

    return ret;
}

and got:

proot info: vpid 30: sysenter start: openat(0xffffffffffffff9c, 0x7fefdc2230, 0x0, 0x0, 0x40100401, 0x5550000055100000) = 0xffffffffffffff9c [0x7fefdc2180, 0]
proot info: vpid 30: translate("/" + "/aaa_before_errno=48")
proot info: vpid 30:          -> "/data/data/tech.ula/files/1/aaa_before_errno=48"
proot info: vpid 30: sysenter end: openat(0xffffffffffffff9c, 0x7fefdc2150, 0x0, 0x0, 0x40100401, 0x5550000055100000) = 0xffffffffffffff9c [0x7fefdc2180, 0]
proot info: vpid 30: seccomp SIGSYS: void(0xffffffffffffff9c, 0x7fefdc23e0, 0x100, 0x2, 0x7fefdc22d8, 0x7aa5f045f0) = 0xffffffffffffff9c [0x7fefdc2200, 0]
proot info: SIGSYS PR_void handled, but not for SYSCALL_AVOIDER. Falling through to default handling
proot info: SIGSYS. Return set to -ENOSYS
proot info: vpid 30: sysenter start: fstatat64(0xffffffffffffff9c, 0x7fefdc23e0, 0x7fefdc2078, 0x100, 0x100, 0xffffff9c) = 0xffffffffffffff9c [0x7fefdc2050, 0]
proot info: vpid 30: translate("/home/user" + "old_ls")
proot info: vpid 30:          -> "/data/data/tech.ula/files/1/home/user/old_ls"
proot info: vpid 30: sysenter end: fstatat64(0xffffffffffffff9c, 0x7fefdc2023, 0x7fefdc2078, 0x100, 0x100, 0xffffff9c) = 0xffffffffffffff9c [0x7fefdc2023, 0]
proot info: vpid 30: sysexit start: fstatat64(0x0, 0x7fefdc2023, 0x7fefdc2078, 0x100, 0x100, 0xffffff9c) = 0x0 [0x7fefdc2023, 0]
proot info: vpid 30: sysexit end: fstatat64(0x0, 0x7fefdc23e0, 0x7fefdc2078, 0x100, 0x100, 0xffffff9c) = 0x0 [0x7fefdc2050, 0]
proot info: vpid 30: sysenter start: openat(0xffffffffffffff9c, 0x7fefdc2230, 0x0, 0x0, 0x40100401, 0x51000040000000) = 0xffffffffffffff9c [0x7fefdc2180, 0]
proot info: vpid 30: translate("/" + "/aaa_after_ret=0_errno=38")
proot info: vpid 30:          -> "/data/data/tech.ula/files/1/aaa_after_ret=0_errno=38"
proot info: vpid 30: sysenter end: openat(0xffffffffffffff9c, 0x7fefdc214b, 0x0, 0x0, 0x40100401, 0x51000040000000) = 0xffffffffffffff9c [0x7fefdc2180, 0]

"seccomp SIGSYS" here was for statx; it caused errno to be set to 38 (ENOSYS) and therefore triggered the fallback in glibc.

glibc statx source is here; __NR_statx was defined, __ASSUME_STATX wasn't. Checked on libc6/kali-rolling,now 2.31-9 arm64 [installed].
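
To illustrate the kind of fallback being described, here is a minimal sketch in C. It is not glibc's actual source, and `stat_with_fallback` is a hypothetical name. It assumes the fallback works as the log suggests: try the statx syscall first, and only when it fails with ENOSYS retry with fstatat.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/stat.h>   /* struct statx, STATX_BASIC_STATS */
#include <string.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper (not glibc code). Returns 1 if statx answered,
 * 0 if the fstatat fallback was used, -1 on error with errno set.
 * `flags` must be valid for both calls (e.g. 0 or AT_SYMLINK_NOFOLLOW). */
int stat_with_fallback(int dirfd, const char *path, int flags, struct stat *out)
{
    assert(path != NULL && out != NULL);
#ifdef SYS_statx
    struct statx stx;
    if (syscall(SYS_statx, dirfd, path, flags, STATX_BASIC_STATS, &stx) == 0) {
        /* Copy only a few basic fields; enough for this illustration. */
        memset(out, 0, sizeof(*out));
        out->st_mode = stx.stx_mode;
        out->st_ino = stx.stx_ino;
        out->st_nlink = stx.stx_nlink;
        out->st_uid = stx.stx_uid;
        out->st_gid = stx.stx_gid;
        out->st_size = stx.stx_size;
        return 1;
    }
    if (errno != ENOSYS)
        return -1; /* a real failure such as ENOENT; do not mask it */
#endif
    /* The kernel (or a tracer such as proot) reported "no such syscall". */
    return fstatat(dirfd, path, out, flags) == 0 ? 0 : -1;
}
```

A return value of 0 from `stat_with_fallback(AT_FDCWD, "a.txt", 0, &st)` would indicate that the ENOSYS path was taken, which is the situation shown in the log above.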

PR_void suggests that the statx syscall is not even registered in proot's syscall table, so chances are that this issue is already fixed in the Termux fork (although you'd need to check that; either way, if you can get this to reproduce in Termux, it will be easier for me to debug).

For statx there are multiple combinations to support (depending on Android and Linux version there are following options, I think each of them is present on some real devices):

corbinlc commented 3 years ago

I can confirm this is already fixed in the latest code base. I have a little more work to merge, but it does look like this issue is solved.

corbinlc commented 3 years ago

Btw... Thanks for looking into this.

corbinlc commented 3 years ago

@michalbednarski I finished up the merge ( https://github.com/CypherpunkArmory/proot/tree/merge-it ). I mostly went with what you had, except for a couple of differences related to specific bugs users encountered. It looks to be mostly working but one issue popped out on testing.

I previously had an incomplete solution for handling shmget/shmat and a few other things, probably the most substantial addition to UserLAnd proot since the last merge back to termux (you might find it interesting: https://github.com/CypherpunkArmory/proot/blob/master/src/extension/fake_id0/shm.c ). I never merged it back because there were some bugs I never had time to debug and I had to mostly stop work on UserLAnd (I think I forgot to close one file handle, because people were reporting running out of file handles after quite a bit of time using applications that use shared memory). Your solution sounds more robust and certainly handles more system calls, so I merged that in and deleted mine.

I am getting one error from sysvipc_shm.c currently when running pg_createcluster 13 main --start after sudo apt install postgresql. Specifically this seems to be failing:

        if (request->op == SHMHELPER_ALLOC) {
                int fd = -1;
                read(helper2proot, &fd, sizeof(fd));  /* <-- this returns a negative fd */
                return fd;
        }

I haven't analyzed what all this extension does yet, just the intent and where the error pops up. What do you think I should look at next?
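
One way to narrow this down might be to check the return value of read() separately from the value it delivers, so that a failed or short read can be told apart from the helper deliberately sending a negative number. A minimal sketch in C, where `read_int_reply` is a hypothetical helper and not something that exists in proot:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not proot code): read exactly one int from pipe_fd.
 * Returns 0 and stores the value in *out on success. Returns -1 on a read
 * error or if the writer closed the pipe before a full int arrived. */
int read_int_reply(int pipe_fd, int *out)
{
    assert(out != NULL);

    int value = -1;
    char *p = (char *)&value;
    size_t got = 0;

    while (got < sizeof(value)) {
        ssize_t n = read(pipe_fd, p + got, sizeof(value) - got);
        if (n > 0) {
            got += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;      /* interrupted by a signal; retry */
        if (n == 0)
            errno = EPIPE; /* EOF: the writer went away mid-reply */
        return -1;
    }
    *out = value;
    return 0;
}
```

With this, a -1 return points at the pipe or the helper process itself, while a 0 return with a negative `*out` means the helper really did report a failure from its side.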

Note: The interesting differences are that I still build off of the android-5 branch ( https://github.com/CypherpunkArmory/UserLAnd-Assets-Support/blob/docker-build/input/main.sh ) and use a variety of environment variables and bindings when calling proot ( https://github.com/CypherpunkArmory/UserLAnd-Assets-Support/blob/staging/assets/all/execInProot.sh ).

Any thoughts would be appreciated.

corbinlc commented 3 years ago

Any thoughts on this @michalbednarski ? Maybe I will just merge the statx part of this right now, but I do want to get all merged up in the near future. Thanks!

michalbednarski commented 3 years ago

So far I've checked that building and running the package in Termux from your branch works correctly.

I haven't looked further yet, but here are a few things that might help with debugging:

corbinlc commented 3 years ago

Before bringing this up, I did confirm that the prootshm file was getting created. I will work through the other suggestions and see where I end up. Thank you!

corbinlc commented 3 years ago

@michalbednarski I finally got back to debugging this. This is related to UserLAnd targeting Android 10. See this behavior change:

Shared memory
Ashmem has changed the format of dalvik maps in /proc/<pid>/maps, affecting apps that directly parse the maps file. Application developers should test the /proc/<pid>/maps format on devices that run Android 10 or higher and parse accordingly if the app depends on dalvik map formats.

Apps targeting Android 10 cannot directly use ashmem (/dev/ashmem) and must instead access shared memory via the NDK’s ASharedMemory class. In addition, apps cannot make direct IOCTLs to existing ashmem file descriptors and must instead use either the NDK’s ASharedMemory class or the Android Java APIs for creating shared memory regions. This change increases security and robustness when working with shared memory, improving performance and security of Android overall.

So when the shm helper tries to allocate memory, it fails to open /dev/ashmem. Any thoughts on this? I am going to look at how easy it is to use the mentioned class.

corbinlc commented 3 years ago

In the same branch I have been playing with ( https://github.com/CypherpunkArmory/proot/tree/merge-it ) it looks to be mostly working now, though with very little testing so far. I had to do the following: include <android/sharedmem.h>, switch to using ASharedMemory_create(), link with -landroid, and make the build target API level 26 or newer. That last one will require a different build for earlier versions of Android, which don't have ASharedMemory_create(). I am going to have to see how to improve upon that and do a bit of testing.
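
One possible way to avoid a separate build per API level is to look up ASharedMemory_create() at run time instead of linking against it. The sketch below is untested on a device and rests on assumptions: that the symbol lives in libandroid.so, that dlsym() returns NULL for it before API level 26, and that the caller keeps its existing /dev/ashmem path for that case. `create_shared_region` is a hypothetical name.

```c
#include <assert.h>
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>

/* Signature of ASharedMemory_create() from <android/sharedmem.h>. */
typedef int (*asharedmemory_create_fn)(const char *name, size_t size);

/* Hypothetical helper (not proot code): returns a shared-memory fd, or -1
 * with errno = ENOSYS when ASharedMemory_create() cannot be resolved, in
 * which case the caller could fall back to the legacy /dev/ashmem path. */
int create_shared_region(const char *name, size_t size)
{
    assert(name != NULL);

    static asharedmemory_create_fn create_fn;
    static int resolved;

    if (!resolved) {
        /* Assumption: the function is exported by libandroid.so. */
        void *lib = dlopen("libandroid.so", RTLD_NOW);
        if (lib != NULL)
            create_fn = (asharedmemory_create_fn)dlsym(lib, "ASharedMemory_create");
        resolved = 1;
    }
    if (create_fn == NULL) {
        errno = ENOSYS;
        return -1;
    }
    return create_fn(name, size);
}
```

The trade-off is that a missing symbol is only discovered at run time, so the ENOSYS branch needs its own testing on an older device.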