netblue30 / firejail

Linux namespaces and seccomp-bpf sandbox
https://firejail.wordpress.com
GNU General Public License v2.0

Higher argument limits? (Error: too many arguments) #4633

Open Artefact2 opened 3 years ago

Artefact2 commented 3 years ago

It seems that firejail has much lower limits for the number of arguments it supports (or the total size of argv), compared to other standard programs. This makes using firejail with long file lists a painful process, as standard tools like find -exec or xargs won't work out of the box.

Here's a simple way to reproduce:

% touch $(seq 1000 2000)
% /usr/bin/echo ./*
(works)
% firejail --noprofile /usr/bin/echo ./*
Error: too many arguments
% ls --zero | xargs -0 /usr/bin/echo
(works)
% ls --zero | xargs -0 firejail --noprofile /usr/bin/echo
Error: too many arguments
% find . -exec /usr/bin/echo {} +
(works)
% find . -exec firejail --noprofile /usr/bin/echo {} +
Error: too many arguments

For additional reference:

% echo $SHELL
/bin/zsh
% firejail --version
firejail version 0.9.66
% xargs --show-limits
Your environment variables take up 1198 bytes
POSIX upper limit on argument length (this system): 2502482
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2501284
Size of command buffer we are actually using: 131072
Maximum parallelism (--max-procs must be no greater): 2147483647

kmk3 commented 3 years ago

Current limit:

#define MAX_ARGS 128      // maximum number of command arguments (argc)
#define MAX_ARG_LEN (PATH_MAX + 32) // --foobar=PATH
extern char *fullargv[MAX_ARGS];
extern int fullargc;

That's been the case from the beginning it seems:

Main usage:

            int j;
            for (i = 1, j = fullargc; i < argc && j < MAX_ARGS; i++, j++, fullargc++)
                fullargv[j] = argv[i];

            // replace argc/argv with fullargc/fullargv
            argv = fullargv;
            argc = j;

@netblue30 Is there any specific reason for this limit?

It seems that something like the following could be done instead:

fullargv = malloc((fullargc + argc) * sizeof(*fullargv));

Anyway, I'm currently already working on 2 other things, so if anyone wants to take it, feel free to do so.
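
For illustration, here is a minimal sketch of how a dynamically sized argument vector might be built. The helper name `merge_args` and its parameters are hypothetical and not part of firejail; the sketch only assumes, as in the loop quoted above, that some already-collected arguments are followed by argv[1..argc-1]. The array is sized by the number of pointers, plus one slot for the terminating NULL.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of firejail: build a NULL-terminated
 * argument vector holding the first `prefixc` entries of `prefixv`
 * followed by argv[1..argc-1]. The caller frees the returned array
 * (not the strings, which are only borrowed).
 */
static char **merge_args(char *const *prefixv, int prefixc,
                         char *const *argv, int argc, int *outc)
{
    assert(prefixc >= 0 && argc >= 1 && outc != NULL);

    /* prefixc entries + (argc - 1) entries + terminating NULL */
    size_t n = (size_t)prefixc + (size_t)argc;
    char **merged = malloc(n * sizeof(*merged));
    if (merged == NULL)
        return NULL;

    int j = 0;
    for (int i = 0; i < prefixc; i++)
        merged[j++] = prefixv[i];
    for (int i = 1; i < argc; i++)  /* skip argv[0], like the loop above */
        merged[j++] = argv[i];
    merged[j] = NULL;

    *outc = j;
    return merged;
}
```

This only replaces the fixed-size array; any separate check on the argument count or size would still apply.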

rusty-snake commented 3 years ago

Even if you would use malloc:

https://github.com/netblue30/firejail/blob/efbf74e12421c97d8a1756649422f83f4a0b7e50/src/firejail/main.c#L1005-L1008

kmk3 commented 3 years ago

MAX_ARGS can be replaced with ARG_MAX.

Relates to #4583.

kmk3 commented 3 years ago

On second thought, I don't know why a limit on argc would be needed (there is none defined by POSIX at least); I think it would only matter on argv, because of e.g.: memory usage.

topimiettinen commented 3 years ago

On second thought, I don't know why a limit on argc would be needed (there is none defined by POSIX at least); I think it would only matter on argv, because of e.g.: memory usage.

It's a security feature: argv (and the environment) could be used to fill the stack area so that stack smashing would be easier. A similar check exists for environment variables too. Raising the limit to ARG_MAX (524288) wouldn't be a good idea, but 128 could be raised a bit higher.

In general, portability to POSIX isn't very interesting for Firejail, since it depends on Linux-only features like mount namespaces, seccomp and capabilities. Without them Firejail would be useless. I don't think BSDs have anything comparable.

kmk3 commented 3 years ago

@topimiettinen commented on Oct 23:

On second thought, I don't know why a limit on argc would be needed (there is none defined by POSIX at least); I think it would only matter on argv, because of e.g.: memory usage.

It's a security feature: argv (and environment) could be used to fill the stack area so that stack smashing would be easier. Similar check exists for environment variables too. Raising the limit to ARG_MAX (524288) wouldn't be a good idea, but 128 could be raised a bit higher.

I see, that makes sense. But still, wouldn't it make more sense to check only for argv (regardless of the specific upper bound number that would be chosen) rather than argc?

In general, portability to POSIX isn't very interesting for Firejail, since it depends on Linux-only features like mount namespaces, seccomp and capabilities. Without them Firejail would be useless. I don't think BSDs have anything comparable.

Never thought I'd get an appropriate occasion to do it, but:

I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX.

The examples listed are kernel-level features, but limits.h is system-defined and thus can be overridden by e.g.: other OS components. #4583 is a great example of that, as glibc overrides what Linux defines in limits.h, but musl doesn't. So keeping POSIX in mind in this case could help with libc portability (rather than kernel portability). There is also #4293, which was intended to make the configure script portable between different POSIX-compliant shells (e.g.: bash and dash).

Anyway, POSIX was just an example in my previous comment; it's the only thing that I know that explicitly defines such limits. What I meant is that Linux/glibc/musl define related limits using the macro names specified by POSIX, but neither Linux/glibc/musl nor POSIX define any limits for argc AFAIK.
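
As a sketch of the libc-portability point: whether `ARG_MAX` is available at compile time depends on the libc's headers, so code that wants the limit could ask at run time first and fall back to the macro only when it exists. The helper name `arg_size_limit` is hypothetical and not part of firejail; `sysconf(_SC_ARG_MAX)` and `_POSIX_ARG_MAX` are the standard POSIX names.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <limits.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of firejail: return the system's limit on
 * the combined size of argv and the environment, in bytes.
 */
static long arg_size_limit(void)
{
    long limit = sysconf(_SC_ARG_MAX);  /* runtime value; -1 if indeterminate */
#ifdef ARG_MAX
    if (limit < 0)
        limit = ARG_MAX;                /* compile-time value, if the libc defines one */
#endif
    if (limit < 0)
        limit = _POSIX_ARG_MAX;         /* POSIX minimum acceptable value (4096) */
    assert(limit > 0);
    return limit;
}
```

The last fallback is the same 4096 that `xargs --show-limits` reports as the "POSIX smallest allowable upper limit". None of these values bounds argc itself; they only bound the total size.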

topimiettinen commented 3 years ago

@topimiettinen commented on Oct 23:

On second thought, I don't know why a limit on argc would be needed (there is none defined by POSIX at least); I think it would only matter on argv, because of e.g.: memory usage.

It's a security feature: argv (and environment) could be used to fill the stack area so that stack smashing would be easier. Similar check exists for environment variables too. Raising the limit to ARG_MAX (524288) wouldn't be a good idea, but 128 could be raised a bit higher.

I see, that makes sense. But still, wouldn't it make more sense to check only for argv (regardless of the specific upper bound number that would be chosen) rather than argc?

The important thing is actually the size, or the sum of all arguments.

In general, portability to POSIX isn't very interesting for Firejail, since it depends on Linux-only features like mount namespaces, seccomp and capabilities. Without them Firejail would be useless. I don't think BSDs have anything comparable.

Never thought I'd get an appropriate occasion to do it, but:

I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX.

I think calling a system GNU is exaggerating; there are many other non-GNU components in the system too, and the distros also put a lot of effort into integration. The most important piece these days is the browser, which isn't GNU, and a lot of old UNIX-like GNU or POSIX stuff with text UIs isn't very cool anymore. It isn't fair calling the phone systems 'Android' either, but calling it 'Linux/Bionic' would be silly.

The examples listed are kernel-level features, but limits.h is system-defined and thus can be overridden by e.g.: other OS components. #4583 is a great example of that, as glibc overrides what Linux defines in limits.h, but musl doesn't. So keeping POSIX in mind in this case could help with libc portability (rather than kernel portability). There is also #4293, which was intended to make the configure script portable between different POSIX-compliant shells (e.g.: bash and dash).

Anyway, POSIX was just an example in my previous comment; it's the only thing that I know that explicitly defines such limits. What I meant is that Linux/glibc/musl define related limits using the macro names specified by POSIX, but neither Linux/glibc/musl nor POSIX define any limits for argc AFAIK.

Good points with portability to musl. Though does any distro use it for real?

reinerh commented 3 years ago

Good points with portability to musl. Though does any distro use it for real?

Alpine comes to mind. And many embedded distributions like OpenWrt.

kmk3 commented 3 years ago

@topimiettinen commented on Oct 23:

On second thought, I don't know why a limit on argc would be needed (there is none defined by POSIX at least); I think it would only matter on argv, because of e.g.: memory usage.

It's a security feature: argv (and environment) could be used to fill the stack area so that stack smashing would be easier. Similar check exists for environment variables too. Raising the limit to ARG_MAX (524288) wouldn't be a good idea, but 128 could be raised a bit higher.

I see, that makes sense. But still, wouldn't it make more sense to check only for argv (regardless of the specific upper bound number that would be chosen) rather than argc?

The important thing is actually the size, or the sum of all arguments.

So something like this?

int i;
size_t total_argsz = argc;
for (i = 0; i < argc; i++) {
    total_argsz += strlen(argv[i]) + 1; // string plus its terminating NUL

    if (total_argsz > SOME_ARG_MAX) {
        fprintf(stderr, "Error: too many arguments\n");
        exit(1);
    }
}

In general, portability to POSIX isn't very interesting for Firejail, since it depends on Linux-only features like mount namespaces, seccomp and capabilities. Without them Firejail would be useless. I don't think BSDs have anything comparable.

Never thought I'd get an appropriate occasion to do it, but:

I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX.

I think calling a system GNU is exaggerating, there are many other non-GNU components in the system too and the distros also put a lot of effort to integration. The most important piece these days is the browser, which isn't GNU and a lot of old UNIX-like GNU or POSIX stuff with text UIs isn't very cool anymore. It isn't fair calling the phone systems 'Android' either, but calling it 'Linux/Bionic' would be silly.

The copypasta was posted mostly in jest to point out that there is more to POSIX than the kernel-related functions and the C extensions. I think that the main point is to remind people that the shell, coreutils, make and the compiler are essential parts of the OS (as in, you couldn't even build the Linux kernel without them), which are also all specified by POSIX. You can have a "complete" (i.e.: self-bootstrapping) Linux software distribution (GNU or not) without a web browser, but probably not without the aforementioned tools. That is, everything else could be considered an application built on top of the OS, not the OS itself.

As for Android, it takes an absurd amount of disk space (in the realm of multiple GiB with all the tooling last I checked) just to build a normal application, let alone the entire system (it requires dozens of GiB AFAIK). And I'm not sure that all of the necessary tools are even available under FLOSS licenses, let alone packaged in their entirety on e.g.: Debian. And that is just to effectively cross-compile it from another system; I doubt it's even feasible to completely do it on a conventional Android device itself, let alone to be able to actually boot the result. AFAICT the unix-y portions of it are mostly buried under layers and layers of frameworks and abstractions. System-wise it is as far from a conventional Linux or BSD software distribution as it gets and should indeed probably just be considered its own (incomplete) thing.

Now, something similar could be said about the complexity of gcc/clang (either of which is currently required to build the kernel AFAIK), though there is work being done to enable smaller (and C-based) alternatives, but even disregarding that, I would consider gcc/clang to be at least more familiar beasts than the Java runtimes and what not.

The examples listed are kernel-level features, but limits.h is system-defined and thus can be overridden by e.g.: other OS components. #4583 is a great example of that, as glibc overrides what Linux defines in limits.h, but musl doesn't. So keeping POSIX in mind in this case could help with libc portability (rather than kernel portability). There is also #4293, which was intended to make the configure script portable between different POSIX-compliant shells (e.g.: bash and dash).

Anyway, POSIX was just an example in my previous comment; it's the only thing that I know that explicitly defines such limits. What I meant is that Linux/glibc/musl define related limits using the macro names specified by POSIX, but neither Linux/glibc/musl nor POSIX define any limits for argc AFAIK.

Good points with portability to musl. Though does any distro use it for real?

Off the top of my head, from the distros that either do or did package firejail (from README.md):

* Alpine Linux (musl)

* Void Linux (either glibc or musl)

Alpine is known for being small and so is often used on Docker containers.

postmarketOS is based on Alpine and is intended for phones (and so is much closer to a conventional distro than Android).

Void is the one that officially supports the most BSD-y userland of the list that I know of. Examples: runit, seatd, sndio.

Also, the main KISS Linux build is musl-based, but I'm not aware of any firejail build recipe.

topimiettinen commented 3 years ago

@topimiettinen commented on Oct 23:

On second thought, I don't know why a limit on argc would be needed (there is none defined by POSIX at least); I think it would only matter on argv, because of e.g.: memory usage.

It's a security feature: argv (and environment) could be used to fill the stack area so that stack smashing would be easier. Similar check exists for environment variables too. Raising the limit to ARG_MAX (524288) wouldn't be a good idea, but 128 could be raised a bit higher.

I see, that makes sense. But still, wouldn't it make more sense to check only for argv (regardless of the specific upper bound number that would be chosen) rather than argc?

The important thing is actually the size, or the sum of all arguments.

So something like this?

int i;
size_t total_argsz = argc;
for (i = 0; i < argc; i++) {
    total_argsz += strlen(argv[i]) + 1; // string plus its terminating NUL

    if (total_argsz > SOME_ARG_MAX) {
        fprintf(stderr, "Error: too many arguments\n");
        exit(1);
    }
}

Yes. The calculation could also consider environment variables. Or maybe, knowing the layout of argv/env/auxv on the stack, simply check that the stack isn't used too much.
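
A rough sketch of the "also consider environment variables" variant might look like the following. The helper names `strvec_size` and `args_fit` are hypothetical and not part of firejail, and `max_bytes` stands in for whatever bound (the `SOME_ARG_MAX` placeholder above) would be chosen. It sums the strings and pointer slots only; it does not inspect the actual stack layout or auxv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of firejail: bytes occupied by a
 * NULL-terminated string vector (argv or envp), counting each string,
 * its terminating NUL and the pointer slot that refers to it.
 */
static size_t strvec_size(char *const *vec)
{
    assert(vec != NULL);

    size_t total = sizeof(char *);  /* the terminating NULL pointer */
    for (; *vec != NULL; vec++)
        total += strlen(*vec) + 1 + sizeof(char *);
    return total;
}

/* Returns 1 if argv and envp together fit within max_bytes, 0 otherwise. */
static int args_fit(char *const *argv, char *const *envp, size_t max_bytes)
{
    return strvec_size(argv) + strvec_size(envp) <= max_bytes;
}
```

A caller could pass the process environment (for example via `extern char **environ;`) as `envp` and exit with an error when `args_fit` returns 0.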

In general, portability to POSIX isn't very interesting for Firejail, since it depends on Linux-only features like mount namespaces, seccomp and capabilities. Without them Firejail would be useless. I don't think BSDs have anything comparable.

Never thought I'd get an appropriate occasion to do it, but: I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX.

I think calling a system GNU is exaggerating, there are many other non-GNU components in the system too and the distros also put a lot of effort to integration. The most important piece these days is the browser, which isn't GNU and a lot of old UNIX-like GNU or POSIX stuff with text UIs isn't very cool anymore. It isn't fair calling the phone systems 'Android' either, but calling it 'Linux/Bionic' would be silly.

The copypasta was posted mostly in jest to point out that there is more to POSIX than the kernel-related functions and the C extensions. I think that the main point is to remind people that the shell, coreutils, make and the compiler are essential parts of the OS (as in, you couldn't even build the Linux kernel without them), which are also all specified by POSIX. You can have a "complete" (i.e.: self-bootstrapping) Linux software distribution (GNU or not) without a web browser, but probably not without the aforementioned tools. That is, everything else could be considered an application built on top of the OS, not the OS itself.

We're approaching the point where the shell and other utils aren't used very much for launching apps in the typical installation: kernel -> systemd -> sddm -> 'systemd --user' -> app (browser etc.). Also, development is done more and more in the cloud, like here on GitHub. With vscode.dev it's probably possible to do all development, including editing, remotely.

With the Meson build system, a shell isn't necessary when compiling and it's much saner (and faster) than Make. Perhaps Firejail should switch to Meson at some point.

I've actually long wanted a system, which would have two operating modes:

Requiring a reboot to switch modes or even switch Secure Boot on/off from UEFI BIOS could be OK. Maybe the production image could be even produced by a CI job far away.

As for Android, it takes an absurd amount of disk space (in the realm of multiple GiB with all the tooling last I checked) just to build a normal application, let alone the entire system (it requires dozens of GiB AFAIK). And I'm not sure that all of the necessary tools are even available under FLOSS licenses, let alone packaged in their entirety on e.g.: Debian. And that is just to effectively cross-compile it from another system; I doubt it's even feasible to completely do it on a conventional Android device itself, let alone to be able to actually boot the result. AFAICT the unix-y portions of it are mostly buried under layers and layers of frameworks and abstractions. System-wise it is as far from a conventional Linux or BSD software distribution as it gets and should indeed probably just be considered its own (incomplete) thing.

It's also very interesting from a security point of view. Android uses a combination of UIDs and SELinux categories (or sensitivities) for each app. This makes a lot of sense: there's typically no need for app A to look at the files of app B. Firejail helps a lot here by using mount namespaces and by rearranging the apps' views of the contents of $HOME, but it would be better if the OS also did something Android-like. There could also be snaps/appimages/flatpaks, but produced locally from distro packages with some distro-supported automation.

Now, something similar could be said about the complexity of gcc/clang (either of which is currently required to build the kernel AFAIK), though there is work being done to enable smaller (and C-based) alternatives, but even disregarding that, I would consider gcc/clang to be at least more familiar beasts than the Java runtimes and what not.

The examples listed are kernel-level features, but limits.h is system-defined and thus can be overridden by e.g.: other OS components. #4583 is a great example of that, as glibc overrides what Linux defines in limits.h, but musl doesn't. So keeping POSIX in mind in this case could help with libc portability (rather than kernel portability). There is also #4293, which was intended to make the configure script portable between different POSIX-compliant shells (e.g.: bash and dash).

Anyway, POSIX was just an example in my previous comment; it's the only thing that I know that explicitly defines such limits. What I meant is that Linux/glibc/musl define related limits using the macro names specified by POSIX, but neither Linux/glibc/musl nor POSIX define any limits for argc AFAIK.

Good points with portability to musl. Though does any distro use it for real?

Off the top of my head, from the distros that either do or did package firejail (from README.md):

* Alpine Linux (musl)

* Void Linux (either glibc or musl)

Alpine is known for being small and so is often used on Docker containers.

postmarketOS is based on Alpine and is intended for phones (and so is much closer to a conventional distro than Android).

Void is the one that officially supports the most BSD-y userland of the list that I know of. Examples: runit, seatd, sndio.

Also, the main KISS Linux build is musl-based, but I'm not aware of any firejail build recipe.

Thanks. On Debian, musl is available, but it can't be used for pretty much anything, so I thought it's not ready for production.