bitwiseworks / libc

LIBC Next (kLIBC fork)
9 stars 4 forks source link

Pass long command line and environment via shared memory #127

Closed dmik closed 1 year ago

dmik commented 1 year ago

It is known that DosExecPgm has a limit on the total length of the argument strings for the child process roughly equal to 32K. I believe a similar limit exists on the array of environment strings as well. It's time to solve this problem by passing both arrays via shared memory if a child is also a LIBCn process (which is true in 95% cases). This will solve many-many problems when porting stuff from Unix/Linux (where this limit has lifted since long), e.g. the one in GNU make, see e.g. https://github.com/bitwiseworks/kbuild-os2/issues/4.

The solution is rather trivial: I will allocate a block of memory to place both the arguments and the environment there and pass its address as the (only) DosExecPgm argument if the total size is more than 16K (just to be on the safe side — there are reports that if the argument/environment size is close to 32K, including inability to start new processes at all).

dmik commented 1 year ago

BTW, I modified the test cases from the above ticket so that they also test the environment and it turns out that my guess was right. The maximum environment size is the same as for the total argument size — 32K. Which suggests that the OS/2 kernel allocates a single memory segment (64K) for both the environment and arguments (that follow it in the segment). The pEnv and pCommandLine pointers from the FIB struct prove that as well. (I will attach the test cases later).

Once the environment size is exceeded, spawn returns EINVAL — the same way it does when the argument size is exceeded. This comes from DosExePgm's ERROR_BAD_ENVIRONMENT (10).

Note that when arguments get exceeded, DosExecPgm returns ERROR_BAD_ARGUMENTS (160), which is also converted to EINVAL.

StevenLevine commented 1 year ago

This is a holdover from the days when the kernel was 16-bit. A non-trivial percentage of the kernel is still 16-bit code.

dmik commented 1 year ago

I've implemented argument passing and it works like charm but there is a nasty show stopper. The internal struct _new_proc used to pass spawn data to its workhorse __spawnve unfortunately defines arg_size and env_size as unsigned short which effectively limits the total size of arguments and environment to 64K each. Not that much actually (compared to current 32K), I'm pretty sure there are linux command lines longer than that. And more over, the new approach allows to pass arguments limited only by the size of the largest currently available shared memory block (which is way more than 64K obviously).

I need to check if it's safe to enhance _new_proc.

dmik commented 1 year ago

BTW two more details. There is Posix ARG_MAX which should specify the maximum size of both args and environment supported by the platform (see here) which is currently set to 0x7fe0 (32K - 16). This is not accurate because on OS/2 the maximum for the sum of args and environment is actually 64K. However, DosExecPgm does not support spending more than 32K for one half if the other half is smaller — it will fail if either of the halves is greater than 32K regardless of the other one.

So I think I will just leave it as is and hope for ones who want details to discover this ticket. Note also that this limit (32K + 32K) is still in force when LIBC starts a non-LIBC (or pre-LIBC 0.6.4) executable because we can't use shared memory in such cases. So having ARG_MAX low still makes sense.

Another thing is that there is actually a special posix errno when ARG_MAX limit is exceeded — E2BIG. I don't know why Knut didn't do that but I will fix the OS/2 error conversion routine to do that for errors ERROR_BAD_ENVIRONMENT (10) and ERROR_BAD_ARGUMENTS (160).

psmedley commented 1 year ago

This sounds really cool @dmik - particularly the use of E2BUG. the nodejs build returning EINVAL confused the hell out of me. Hopefully this means we don't need to hack things to use response files to build nodejs. I'll check once this is committed.

dmik commented 1 year ago

@psmedley Thanks, yes, that's the idea. Although I ported kBuild to the latest version to get GNU make 4 and its $(file) function, I was still not satisfied with the result (of needing to change the makefiles in NodeJS and the way how $(file) works in the end). Also I didn't like the @rsp hack for GNU make by KOMH so I decided to finally implement it in LIBC.

komh commented 1 year ago

Oh, good job!

And it would be better to support @rsp method for arguments as a fallback for the non-LIBC programs, because even if non-LIBC programs, they support @rsp method in many cases.

dmik commented 1 year ago

Large environment handling is also done. However, doing that I discovered another problem. Argument handling that discovers that there is a special shared memory argument is done from the EXE, after the LIBC DLL gets initialised. And this means that this happens after LIBC actually constructs C environ array. However when we use shared memory for the environment and arguments, DosExecPgm doesn't get any environment at all (itl comes to the shared memory block). So, environ will be empty too during LIBC initialisation. And since LIBC initialisation depends on some variables (e.g. LANG, TZ or LIBC_LOGGING), we get it configured as if these variables were deleted from the environment (although they may be present in the environment passed in the shared memory block). This may result into an incorrect configuration.

There is only one solution here: process the command line (at least partly, to extract the shared memory block from there, if there is one) from LIBC itself, when it is about to initialise environ. I will have to slightly redo my implementation.

@komh thanks! However, regarding @rsp, I'm not that sure it is really necessary. Which non-LIBC programs are you referring here? I don't think there is a lot of them. And they must be native OS/2 apps that don't expect large command lines. And supplying @rsp to any non-LIBC app without knowing if it supports it seems like a wrong approach to me as it may lead to completely unexpected results.

komh commented 1 year ago

Watcom, VAC++, OS/2 toolkits and so on.

If arguments are too long, it's not possible to execute a program at all without special methods. If @rsp is passed, a program get a chance. However, if it cannot handle @rsp argument, then it will complain about it. In worst case, it treats @rsp as a normal file.

So, it's worth doing it.

dmik commented 1 year ago

Building the environment early is done and works for normal C code (e.g. main) but it didn't actually help because LIBC uses DosScanEnv directly at some places. I guess I will just replace is with getenv. It should solve the problem.

@komh, I still somehow don't like the idea without knowing a way to detect from within LIBC if the EXE supports @rsp or not. Besides, do you have any real life cases where you need to pass Watcom or ICC such a long argument line? Native projects have direct response file support and for the Unix world these tools are useless... And more over, it's a bit of a hassle to implement this in LIBC as there is no easy way to delete the file in case of failures.

dmik commented 1 year ago

Finally, the commit is in (it's a rather big change in many places). I will give it more local testing in "prod" mode before releasing. Everyone else is encouraged to test as well.

Remaining problems:

  1. The length of argv[0] (i.e. the first argument - a command name, by convention) is not actually accounted for when calculating if the size exceeds 32K. This is because this argument is given special processing in both the spawn path and the startup path in kLIBC. A possible problem here is that if the first argument is larger than 32K, it won't trigger the shared memory mechanism and spawn will just fail with E2BIG. However, this is very unlikely because an EXE file name can't be bigger than 260 chars (PATH_MAX) on OS/2 anyway — so the error should pop up much earlier when checking the file name etc. (since usually argv[0] is just a copy of the exe path passed to spawn). However, it's still possible to screw it deliberately.

  2. Executing bash scripts via a processor specified in #! on the first line doesn't use this extension so far. This place of code in kLIBC is really cumbersome so it requires additional work to support this scenario.

I will test what I got first and may be will do the above two as well before releasing.

dmik commented 1 year ago

There seems to be some regression in the last commit. git barfs like this here:

D:>git difftool .
sh: 0: Can't open D:/Coding/git/master-run/libexec/git-core\git-difftool--helper.�deps/v8/src/codegen/interface-descriptors.cc
[0600:01] fatal: external diff died, stopping at deps/v8/src/codegen/interface-descriptors.cc

Some parameters are messed up in hash bang script mode.

dmik commented 1 year ago

Found yet another regression of 9bce44d. Due to some arithmetic error the trailing part of the last argument (as long as the executable file name) is cut off when using shared memory but not in hash bang mode.

As a result, a NodeJS build fails like that when running a really long rm & ar command line with a bunch of object files (more than 90k bytes long)

~/Network/novator_C/usr/bin/sh.exe: 1: Syntax error: Unterminated quoted string

This is because the trailing quote was cut off.

dmik commented 1 year ago

And there is one more (minor) problem. It is legal to start programs with no arguments provided — i.e. when the empty argv vector containing only one NULL pointer is given to spawn/exec (note that NULL as argv is not allowed and the process will simply crash in this case). The current implementation will make the started child see the special \x7fkLIBC\7f argument injected for special kLIBC processing in such a case but it should actually not be seen (argc should be 0 in child as well as argv should be an empty vector). Should be easy to fix too.