haskell / unix

POSIX functionality
https://hackage.haskell.org/package/unix

forkProcess fails with +RTS -N #62

Open jbaum98 opened 8 years ago

jbaum98 commented 8 years ago

I have a program in which I use forkProcess. When it is run with -threaded +RTS -N1, it works fine, but when I use -threaded +RTS -N2 I get this error:

failed to create OS thread: Resource temporarily unavailable

I haven't been able to reproduce this issue in a small test case, only within the larger program. To provide some context, though, it's called only from the following function:

pOpen :: FilePath -> [String] -> (Handle -> IO a) -> IO a
pOpen fp args func = do
  pipepair <- createPipe
  pid <- do
    p <- Control.Exception.try (forkProcess $ childstuff pipepair)
    case p of
      Right x -> return x
      Left (e :: Control.Exception.IOException) -> fail ("Error in fork: " ++ show e)
  retval <- callfunc pipepair
  let rv = seq retval retval
  void $ getProcessStatus True False pid
  return rv
  where
    callfunc pipepair = do
      closeFd (fst pipepair)
      h <- fdToHandle (snd pipepair)
      x <- func h
      hClose h
      return $! x

    childstuff pipepair = do
      void $ dupTo (fst pipepair) stdInput
      closeFd $ fst pipepair
      closeFd $ snd pipepair
      executeFile fp True args Nothing

GHC version: 7.10.3
unix version: 2.7.1.0
OS: Ubuntu 16.04

jbaum98 commented 8 years ago

In versions around 2.4, the documentation mentioned this bug, for example in commit 9f0c96ed1eb9eda5e74f13acf1ddc3ab523b0bd4. That warning was later removed in commit 05eea1ea715745d4e2086d.

xtendo-org commented 7 years ago

I confirm that this indeed happens.

Running programs that contain forkProcess with +RTS -N2 causes:

failed to create OS thread: Resource temporarily unavailable

This is not deterministic; sometimes it happens, sometimes it doesn't.

As mentioned in 9f0c96e, shared resources also stumble. I try to create pipes with createPipe, but if I use dupTo to these pipe FDs in the child process, I occasionally get:

dupTo: resource busy (Device or resource busy)

I'd be very grateful if anyone can look into this.

hvr commented 7 years ago

since it was @simonmar who removed that comment, he may know more about...

simonmar commented 7 years ago

It looks like this error will be produced if pthread_create returns EAGAIN. According to the man page on my system (Linux 4.6):

       EAGAIN Insufficient resources to create another thread.

       EAGAIN A  system-imposed  limit  on the number of threads was encountered.  There are a number of limits
              that may trigger this error: the RLIMIT_NPROC soft resource limit (set via  setrlimit(2)),  which
              limits  the number of processes and threads for a real user ID, was reached; the kernel's system-
              wide limit on the number of processes and threads, /proc/sys/kernel/threads-max, was reached (see
              proc(5)); or the maximum number of PIDs, /proc/sys/kernel/pid_max, was reached (see proc(5)).

Could either of those be the case?
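
One way to check the two system-wide limits named in the man page is to read them from /proc. The following is only a minimal sketch for Linux: the helper names parseLimit and readLimit are made up for illustration, and the RLIMIT_NPROC soft limit is not covered here.

```haskell
import Control.Exception (IOException, try)
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- Hypothetical helper: parse the leading run of digits in a /proc entry.
parseLimit :: String -> Maybe Integer
parseLimit = readMaybe . takeWhile isDigit

-- Hypothetical helper: read one limit, returning Nothing if the file
-- cannot be opened (for example on a non-Linux system).
readLimit :: FilePath -> IO (Maybe Integer)
readLimit path = do
  r <- try (readFile path) :: IO (Either IOException String)
  return (either (const Nothing) parseLimit r)

main :: IO ()
main = mapM_ report
  [ "/proc/sys/kernel/threads-max"
  , "/proc/sys/kernel/pid_max"
  ]
  where
    report p = do
      v <- readLimit p
      putStrLn (p ++ ": " ++ maybe "unavailable" show v)
```

Comparing these values with the number of processes and threads already running for the user would show whether either limit is close to being hit.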

xtendo-org commented 7 years ago

@simonmar I don't think we're reaching any kind of system-imposed limit, because the program has no problem with +RTS -N1. I've tested with a program that forks 100 processes doing I/O-bound work, and forking 100 times causes no trouble as long as there is no OS-thread switching in the program. Error messages like failed to create OS thread: Resource temporarily unavailable or dupTo: resource busy (Device or resource busy) only happen with +RTS -N4 (or any number higher than 1).

Doing the work of forking and pipe duping in the FFI actually solves this problem. This is how the callProcess family of the process package deals with the issue. @snoyberg pointed this out in this Reddit comment, so maybe he can give us some hint?
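
For comparison, here is a rough sketch of the pOpen function from the original report rewritten on top of the process package, so that forking and descriptor setup are left to that library instead of forkProcess and dupTo. The name pOpenViaProcess is made up for this example, error handling is deliberately minimal, and whether this avoids the failure in any particular program is an assumption that has not been tested here.

```haskell
import Control.Monad (void)
import System.IO (Handle, hClose, hPutStrLn)
import System.Process
  ( CreateProcess (std_in)
  , StdStream (CreatePipe)
  , createProcess
  , proc
  , waitForProcess
  )

-- Hypothetical replacement for pOpen: run the command with a pipe on its
-- stdin, hand the write end to the callback, then wait for the child.
pOpenViaProcess :: FilePath -> [String] -> (Handle -> IO a) -> IO a
pOpenViaProcess fp args func = do
  (mIn, _, _, ph) <- createProcess (proc fp args) { std_in = CreatePipe }
  case mIn of
    Nothing -> ioError (userError "pOpenViaProcess: no stdin handle")
    Just h -> do
      x <- func h
      hClose h                   -- signal end of input to the child
      void (waitForProcess ph)   -- reap the child before returning
      return $! x

main :: IO ()
main = pOpenViaProcess "cat" [] (\h -> hPutStrLn h "hello from the parent")
```

Like executeFile with its search flag set to True, proc looks the command up on PATH, so call sites should not need to change.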

I'll try to come up with a smaller example to reproduce this when I have the time.
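
As a possible starting point, a small test along these lines might look like the sketch below. It is only an assumption about what a reproducer could look like and has not been confirmed to trigger the failure; the function name forkAndReap and the sleep command used as the child's workload are placeholders.

```haskell
import Control.Monad (forM, forM_, void)
import System.Posix.Process (executeFile, forkProcess, getProcessStatus)

-- Hypothetical driver: fork n children that each exec a short-lived
-- command, then wait for all of them. Returns the number of children.
forkAndReap :: Int -> IO Int
forkAndReap n = do
  pids <- forM [1 .. n] $ \_ ->
    forkProcess (executeFile "sleep" True ["0.1"] Nothing)
  forM_ pids $ \pid -> void (getProcessStatus True False pid)
  return (length pids)

main :: IO ()
main = do
  n <- forkAndReap 100
  putStrLn ("reaped " ++ show n ++ " children")
```

To compare the two cases, build with ghc -threaded -rtsopts and run once with +RTS -N1 and once with +RTS -N4.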

ixmatus commented 3 years ago

I believe I've hit the same issue, except that I don't see any exceptions: forkProcess simply doesn't return. Furthermore, I don't see this problem on Darwin, but I do on Linux, and setting -N1 fixes it.

hasufell commented 2 years ago

I don't know enough about the RTS to look into this. However, a proper reproducer would be a start.

Until then, maybe we should just restore the warning from https://github.com/haskell/unix/commit/05eea1ea715745d4e2086d3b25a14f35f424045c (or some variation of it) @hs-viktor ?

hs-viktor commented 2 years ago

A reproducer would indeed be helpful. The immediate observations are:

  1. Threads and forking mix poorly. Pick one concurrency model. It is very difficult to portably and correctly fork a multi-threaded process.
  2. In general there is no correct way to do this. The other threads in the caller sometimes need to continue to run in the child, and sometimes must not.
  3. To the extent that there's an issue to solve here, it would be in GHC's rts/Schedule.c.

I guess we could warn users to avoid this function in multi-threaded programs. Beyond that, it is not clear what unix can do here.
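
Until the documentation says more, a caller could add such a warning on their own side. The following is a sketch only: forkLooksRisky and forkProcessWarn are hypothetical names, not part of unix, and treating "threaded RTS with more than one capability" as the risky case is an assumption drawn from the reports in this thread.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)
import Control.Monad (void, when)
import System.IO (hPutStrLn, stderr)
import System.Posix.Process (forkProcess, getProcessStatus)
import System.Posix.Types (ProcessID)

-- Hypothetical predicate: threaded RTS running with more than one capability.
forkLooksRisky :: Bool -> Int -> Bool
forkLooksRisky threaded caps = threaded && caps > 1

-- Hypothetical wrapper: print a warning in the risky case, then fork anyway.
forkProcessWarn :: IO () -> IO ProcessID
forkProcessWarn child = do
  caps <- getNumCapabilities
  when (forkLooksRisky rtsSupportsBoundThreads caps) $
    hPutStrLn stderr
      ("warning: forkProcess called with " ++ show caps
        ++ " capabilities; see haskell/unix issue #62")
  forkProcess child

main :: IO ()
main = do
  pid <- forkProcessWarn (return ())
  void (getProcessStatus True False pid)
```

This does not make forking any safer; it only makes the risky configuration visible at the call site.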