balena-io / qemu

QEMU with additional QEMU_EXECVE flag that persists emulator after an execve
Other
62 stars 28 forks source link

linux-user/scriptload.c: load_script_file returns pointer to local variable #26

Open andersnm opened 1 year ago

andersnm commented 1 year ago

Hi!

In the load_script_file function:

The buf variable is defined as a local variable on the stack: https://github.com/balena-io/qemu/blob/639d1d8903f65d74eb04c49e0df7a4b2f014cd86/linux-user/scriptload.c#L15

A cp pointer points at buf: https://github.com/balena-io/qemu/blob/639d1d8903f65d74eb04c49e0df7a4b2f014cd86/linux-user/scriptload.c#L41

The i_name pointer is assigned from cp: https://github.com/balena-io/qemu/blob/639d1d8903f65d74eb04c49e0df7a4b2f014cd86/linux-user/scriptload.c#L60

The return value is assigned from i_name pointing into the stack: https://github.com/balena-io/qemu/blob/639d1d8903f65d74eb04c49e0df7a4b2f014cd86/linux-user/scriptload.c#L79

It seems the buf variable gets smashed when working with 4kb+ of parameter data. This could solve the issue with aclocal: error: echo failed with exit status: 1 in cross builds.

andersnm commented 1 year ago

Hi again,

Have tested a fix and can confirm it fixes the aclocal: error: echo failed with exit status: 1 errors after autoreconf -i or autogen.sh. Likely the same as reported here: https://github.com/balena-io-library/base-images/issues/688 https://github.com/balena-io-library/base-images/issues/644

The following applies to the current balena-7.0.0 branch

diff --git a/linux-user/scriptload.c b/linux-user/scriptload.c
index 74ac54bcd2..19d297af9f 100644
--- a/linux-user/scriptload.c
+++ b/linux-user/scriptload.c
@@ -72,11 +72,11 @@ int load_script_file(const char *filename, struct linux_binprm *bprm)
         }

         if (i_arg) {
-            new_argp = alloca(sizeof(void *));
-            new_argp[0] = i_arg;
+            new_argp = malloc(sizeof(void *));
+            new_argp[0] = strdup(i_arg);
         }
         bprm->argv = new_argp;
-        bprm->filename = i_name;
+        bprm->filename = strdup(i_name);
     } else {
         return 1;
     }
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 3a4eb073d8..4976c7d1cc 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8581,12 +8581,16 @@ static abi_long qemu_execve(char *filename, char *argv[],
         new_argp[argc + offset] = NULL;

         if (ret==0) {
-            new_argp[3] = bprm->filename;
-            new_argp[4] = bprm->filename;
+            new_argp[3] = strdupa(bprm->filename);
+            new_argp[4] = strdupa(bprm->filename);

             if (bprm->argv != NULL) {
-                new_argp[5] = bprm->argv[0];
+                new_argp[5] = strdupa(bprm->argv[0]);
+                free(bprm->argv[0]);
+                free(bprm->argv);
             }
+
+            free(bprm->filename);
         } else {
             new_argp[3] = argv[0];
         } 

Besides this, there seems to be more dubious stuff in the related syscall.c bits. Not sure if args to the execve should be strdup'ed or rather strdupa'ed? or something else

andersnm commented 1 year ago

The problem was isolated and reproduced through an interactive docker shell based on balenalib/raspberrypi3-debian:buster with apt install build-essential gcc.

The repro basically does the following:

execve("/bin/sh",  [ "sh", "-c", "hashbangscript with many parameters" ]

hashbangscript could be any script that runs through an interpreter. In my case /usr/bin/autom4te (a Perl script) was invoked by autoreconf (via aclocal, both Perl scripts) with a lot of parameters.

You need two files:

nano /usr/bin/hashbangscript and paste the following:

#!/usr/bin/python3

# Same thing with perl:
#!/usr/bin/perl

print("Hello from hashbang\n");

chmod +x /usr/bin/hashbangscript to make it executable.

nano repro.c and paste the following:

#include <stdio.h>
#include <unistd.h>
#include <errno.h>

void main() {

  char* argv[] = {"sh","-c","hashbangscript PASTE 4K OF RANDOM DATA HERE", NULL};
  char* envp[] = { NULL };

  printf("before execve\n");
  int ret = execve("/bin/sh", argv, envp);
  printf("ret %i %i\n", ret, errno);
}

Change where it says PASTE 4K OF RANDOM DATA HERE to ca 4k of random characters. The hashbangscript doesn't care what the parameters are.

gcc repro.c to build the C file.

Then run a.out. It crashes in the cross-build side, but works on the host side:

$ cross-build-start
$ ./a.out
Error ... some filename does not exist

# Compare with:
$ cross-build-end
$ ./a.out
Hello from hashbang