oils-for-unix / oils

Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
http://www.oilshell.org/
Other
2.83k stars 155 forks source link

xargs -P could use Oil's core/process.py abstractions #326

Open andychu opened 5 years ago

andychu commented 5 years ago

This issue is a continuation of #85 . Also related to #198 .

It should be in C because dealing with signals in Python makes it even more complicated. We now have a nice front end thanks to @drwilly, and the front end and back end are clearly separable.

The issue being discussed there is signal coalescing. xargs needs wait for multiple signals at a time:

Note that GNU xargs uses sig_atomic_t, which is just a typedef?

https://stackoverflow.com/questions/24931456/how-does-sig-atomic-t-actually-work

andychu commented 5 years ago

Notes/Questions:

andychu commented 5 years ago

Notes on GNU xargs:

  1. The whole executionhappens in an atexit() handler! Weird.
  if (atexit (close_stdin) || atexit (wait_for_proc_all))                                                                                                                                                                                     
    {                                                                                                                                                                                                                                         
      error (EXIT_FAILURE, errno, _("The atexit library function failed"));                                                                                                                                                                   
    }      
  1. It uses the default SIGCHLD.
  /* Make sure to listen for the kids.  */                                                                                                                                                                                                    
  signal (SIGCHLD, SIG_DFL);                                                                                                                                                                                                                  

OSH uses the default too. So maybe we don't need an event loop model?

Maybe we can reuse waiter.Wait() in core/process.py ? It gives you a callback with (pid, status).

  1. SIGUSR1 and SIGUSR2 just mutate global sig_atomic_t variables in the signal handler. If we want a "user interface" this might not be enough. We might want to do stuff in the main loop rather than in the signal handler.
  sigact.sa_handler = increment_proc_max;                                                                                                                                                                                                     
  sigemptyset(&sigact.sa_mask);                                                                                                                                                                                                               
  sigact.sa_flags = 0;                                                                                                                                                                                                                        
  if (0 != sigaction (SIGUSR1, &sigact, (struct sigaction *)NULL))                                                                                                                                                                            
          error (0, errno, _("Cannot set SIGUSR1 signal handler"));    
andychu commented 5 years ago

Hm yeah I think this can all be done in Python -- SIGUSR1, SIGUSR2, timer with possible UI, and handling child exit. Waiter seems good enough to reuse.

It returns False once there are no more processes to wait for. As long as we check if there are remaining processes to run, that should be race-safe.

So the algorithm is:

  1. Start P processes
  2. Enter a loop with waiter.Wait(), which gets unblocked every time a child exits. Oops well we also have to get unblocked on SIGUSR{1,2}.
  3. Exit when waiter.Wait() returns False (os.wait() returns errno.ECHILD) and there are no processes left to run.

Hm it does seem like we want the self-pipe trick... because we are waiting on multiple signals at a time. There are two conditions when you start a process:

  1. When one finishes
  2. When you get SIGUSR1

Does GNU xargs handle this correctly?


Also xargs writes the errno of exec() to a pipe from the child to the parent. I don't really understand why they need to do that!

                /* Write errno value to parent.  We do this even if                                                                                                                                                                           
                 * the error was not E2BIG, because we need to                                                                                                                                                                                
                 * distinguish successful execs from unsuccessful                                                                                                                                                                             
                 * ones.  The reason we need to do this is to know                                                                                                                                                                            
                 * whether to reap the child here (preventing the                                                                                                                                                                             
                 * exit status processing in wait_for_proc () from                                                                                                                                                                            
                 * changing the value of child_error) or leave it                                                                                                                                                                             
                 * for wait_for_proc () to handle.  We need to have                                                                                                                                                                           
                 * wait_for_proc () handle the exit values from the                                                                                                                                                                           
                 * utility if we run it, for POSIX compliance on the                                                                                                                                                                          
                 * handling of exit values.                                                                                                                                                                                                   
                 */                                                                                                                                                                                                                           
                write (fd[1], &errno, sizeof (int));      
andychu commented 5 years ago

Hm actually I think waiter.Wait() needs *three return values instead of 2:

  1. True -- callback for a process was called, caller should keep waiting
  2. False -- we're done with everything
  3. a third value -- we were interupted by another signal (wait() returns EINTR). Right now there is an internal continue in this case. That is, waiter.Wait() calls posix.wait() multiple times.

Although if we want the UI, it might be better to take SIGUSR{1,2} over a pipe. So then we could also write a character to the pipe from a different source. It's more modular.

andychu commented 3 years ago

862 hm this is somewhat of a dupe