vfork() not working? - Githubissues

th-otto commented 7 months ago

The following example:

#include <unistd.h>
#include <stdio.h>

int main(void)
{
    int i;
    pid_t p;

    p = vfork();
    if (p == 0)
    {
        for (i = 0; i < 10; i++)
        {
            printf("%d: in child\n", i);
            usleep(500000);
        }
    } else if (p == -1)
    {
        printf("cannot fork\n");
    } else
    {
        for (i = 0; i < 10; i++)
        {
            printf("%d: in parent\n", i);
            usleep(500000);
        }
    }
    return 0;
}

prints out

0: in child
1: in child
2: in child
3: in child
4: in child
5: in child
6: in child
7: in child
8: in child
9: in child

(as expected), but then crashes when the child exits.

Replacing vfork() with fork() works as expected (interleaving messages from the child and the parent).

Edit: it also crashes (with an illegal instruction) when calling Pvfork() directly, so maybe the problem is in the kernel, not in mintlib?

th-otto commented 7 months ago

Additional notes: it works when I add a call to either _exit() or Pterm0() directly in the child. However, when using exit(), the parent is terminated as well.

I still wonder why it works on Linux (and also Cygwin) without that call.

th-otto commented 5 months ago

Another note: I recently tried to add a fix in https://github.com/freemint/mintlib/blob/a50606d55035ced7c12ba1784a2e504ec310fbf5/unix/vfork.S#L26-L28 by pushing and popping a1 around the Pvfork call (a1 is used there to return to the caller), on the assumption that a1 might be clobbered by the gemdos call. That fix turned out not to work: apparently, in the child, garbage is popped from the stack when Pvfork returns. But that would also mean that, at the least, the raw gemdos binding for Pvfork in <mint/mintbind.h> is wrong, because it declares d1-d2/a0-a2 as clobbered like all other gemdos calls, causing gcc to push/pop those registers around the call.

Now trying to understand what exactly is going on in the kernel...

th-otto commented 5 months ago

Hm, ok, I think I know what's going on. In the above example, when the child simply returns, it will return to libc_main(). That will in turn call exit(). There are two problems with this:

- exit() will run atexit() handlers and C++ destructors. Since the memory is shared between the child and the parent, that will shut down e.g. the stdio streams for the parent as well. It also has the bad effect that the same exit handlers are run again when the parent exits.
- What is even more serious: the shared memory also includes the stack. Since the child returns, the call to exit() will smash the stack for the parent as well. When vfork() finally returns to the parent, that will jump into nirvana.
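The first point can be observed in isolation. The following is only a sketch, and it deliberately uses fork() instead of vfork() so that the child has its own copy of memory and nothing in the parent can be damaged. The names report_fd, on_exit_handler() and handler_runs_in_child() are made up for this illustration; they are not part of mintlib.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of a pipe; only ever set in the forked child (hypothetical name). */
static int report_fd = -1;

/* atexit() handler: reports through the pipe that it was run. */
static void on_exit_handler(void)
{
    if (report_fd >= 0) {
        ssize_t r = write(report_fd, "x", 1);
        (void)r;
    }
}

/* Returns 1 if the handler ran in the child, 0 if not, -1 on error. */
int handler_runs_in_child(int use_exit)
{
    static int registered;
    int fds[2];
    char c;
    ssize_t n;
    pid_t p;

    if (!registered) {
        atexit(on_exit_handler);
        registered = 1;
    }
    if (pipe(fds) == -1)
        return -1;

    p = fork();             /* fork(), not vfork(): the child gets a private copy */
    if (p == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (p == 0) {
        close(fds[0]);
        report_fd = fds[1];
        if (use_exit)
            exit(0);        /* runs the handlers registered by the parent */
        _exit(0);           /* skips them */
    }

    close(fds[1]);
    n = read(fds[0], &c, 1);    /* 1 byte if the handler ran, 0 on EOF otherwise */
    close(fds[0]);
    waitpid(p, NULL, 0);
    return n == 1;
}
```

Calling handler_runs_in_child(1) should report that the handler ran in the child, while handler_runs_in_child(0) should not. With vfork() the same handler would run on the parent's own data, which is the situation described above.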

Unfortunately I have no idea how to fix that. Theoretically, we would have to ensure that the child never returns. But that obviously cannot be done in the vfork() function itself; it must be done in the caller, and I don't know how to achieve that. I also don't know why the same code works on Linux, which imho does something similar.

vinriviere commented 5 months ago

I found some clue in the vfork() manual:

The child must not return from the current function or call exit(3) (which would have the effect of calling exit handlers established by the parent process and flushing the parent's stdio(3) buffers), but may call _exit(2).
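A minimal sketch of a child that follows this rule, assuming a POSIX-like libc. The function name vfork_child_status() is made up for this illustration, and the child does nothing except terminate with _exit():

```c
#define _DEFAULT_SOURCE     /* exposes vfork() on glibc when a strict -std= is used */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit status, or -1 on error (hypothetical helper). */
int vfork_child_status(int code)
{
    int status;
    pid_t p = vfork();

    if (p == -1)
        return -1;
    if (p == 0) {
        /* Child: it borrows the parent's memory and stack, so it must
           neither return from this function nor call exit(). */
        _exit(code);
    }

    /* Parent: resumes only after the child has exited. */
    if (waitpid(p, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, vfork_child_status(7) should return 7 in the parent. In real code the child would normally call one of the exec functions and fall back to _exit() only if that fails.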

th-otto commented 5 months ago

Yes, thanks. So there is neither a bug in the kernel nor in mintlib, just in my example ;) The function that does the vfork() simply must not return. What remains is the issue with calling Pvfork() directly. I think what happens there is similar. The generated code will roughly look like

        move.l  %a2,-(%sp)      | saved because of the clobber list
        move.l  %d2,-(%sp)
        move.w  #275,-(%sp)     | GEMDOS function number (0x113)
        trap    #1
        addq.l  #2,%sp
        move.l  (%sp)+,%d2      | restored after the call
        move.l  (%sp)+,%a2

Now after the call, only the child will run until it exits or calls Pexec(). But the first thing it does is pop the two registers from the stack. Any subsequent function call will then overwrite those two slots on the stack. Then, when the child exits, the parent will restore garbage when it pops the two registers itself. So, to prevent that, the definition of Pvfork() in mintbind.h must not use a clobber list. Another problem is that the current implementation of vfork() in mintlib only works as long as a1 is not clobbered during the call. This is the case when the call goes directly to MiNT, but may not be true if some TSR is intercepting the gemdos trap.
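The sequence of events can be replayed with a toy model in plain C. This is only a simulation under simplifying assumptions: a small array stands in for the shared stack, two variables stand in for d2 and a2, and the parent's registers and stack pointer are assumed to be restored unchanged when it resumes. None of the names below (struct machine, parent_registers_survive()) exist in mintlib or the kernel.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Toy machine: one stack shared by parent and child, plus two registers. */
struct machine {
    uint32_t stack[STACK_WORDS];
    int sp;                 /* index of the top of stack; grows downward */
    uint32_t d2, a2;
};

static void push(struct machine *m, uint32_t v) { m->stack[--m->sp] = v; }
static uint32_t pop(struct machine *m) { return m->stack[m->sp++]; }

/* Returns 1 if the parent ends up with its original d2/a2, 0 otherwise. */
int parent_registers_survive(int save_around_trap)
{
    struct machine m = { .sp = STACK_WORDS, .d2 = 0x11111111u, .a2 = 0x22222222u };
    int saved_sp;
    uint32_t saved_d2, saved_a2;

    /* Compiler-generated save, as caused by a clobber list. */
    if (save_around_trap) { push(&m, m.a2); push(&m, m.d2); }

    /* trap #1: the parent is suspended with this stack pointer and registers. */
    saved_sp = m.sp;
    saved_d2 = m.d2;
    saved_a2 = m.a2;

    /* The child runs first, on the very same stack. */
    if (save_around_trap) { m.d2 = pop(&m); m.a2 = pop(&m); }
    push(&m, 0xdeadbeefu);  /* any later call in the child reuses the free slots */
    push(&m, 0xdeadbeefu);

    /* The child exits; the parent resumes where it was suspended. */
    m.sp = saved_sp;
    m.d2 = saved_d2;
    m.a2 = saved_a2;
    if (save_around_trap) { m.d2 = pop(&m); m.a2 = pop(&m); }

    return m.d2 == 0x11111111u && m.a2 == 0x22222222u;
}
```

In this model parent_registers_survive(1) returns 0, because the parent pops values the child has overwritten, while parent_registers_survive(0) returns 1, because nothing is saved on the shared stack in the first place.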

mikrosk commented 5 months ago

Then when the child exits, the parent will restore garbage when it itself pops the two registers. So the definition of Pvfork() in mintbind.h must not use a clobber list to prevent that.

Isn't it possible that gcc will now rely on the registers not being modified and try to use them after the call?

th-otto commented 5 months ago

They are not modified. Otherwise the implementation in vfork.S (which uses a1) would not work, either.

What should be checked, though, is whether we have to implement it as a function instead of an expression-type macro, so that we can declare it with the attribute returns_twice. That would also apply to Pfork().

mikrosk commented 5 months ago

I guess the only reason why the macros are used instead of inline functions is speed. In this case it shouldn't matter.

th-otto commented 5 months ago

Using static inline functions (maybe with the attribute always_inline if you want to be safe) should actually generate the same code.

freemint / mintlib

vfork() not working? #66