ndmitchell / shake

Shake build system
http://shakebuild.com
Other
772 stars 118 forks source link

Question: Why does `General.Pool.step` call `worker` (and hence run two tasks per thread)? #823

Open mpickering opened 2 years ago

mpickering commented 2 years ago

When there is more work to do step spawns a new thread and runs the action new >> worker pool. I can't understand what exactly is going on with the worker pool call.

How I understand the code sequence:

  1. New thread if spawned which initially runs the action (now)
  2. worker pool runs, which calls withPool, and so will block until step finishes (because the thread is forked whilst step owns the Pool variable.
  3. step finishes and the spawned thread is added to the pool.
  4. This unblocks worker pool, which then unqueues another task? (This is the part I don't get) and runs it.
  5. Thread finishes and the spawned thread is removed from the Pool.

Step 4 is the part I don't understand, why does each spawned thread get two actions run in it?

Code in question:

worker :: Pool -> IO ()
worker pool = withPool pool $ \s -> pure $ case Heap.uncons $ todo s of
    Nothing -> (s, pure ())
    Just (Heap.Entry _ (_, now), todo2) -> (s{todo = todo2}, now >> worker pool)

-- | Given a pool, and a function that breaks the S invariants, restore them.
--   They are only allowed to touch threadsLimit or todo.
--   Assumes only requires spawning a most one job (e.g. can't increase the pool by more than one at a time)
step :: Pool -> (S -> IO S) -> IO ()
-- mask_ is so we don't spawn and not record it
step pool@(Pool _ done) op = mask_ $ withPool_ pool $ \s -> do
    s <- op s
    case Heap.uncons $ todo s of
        Just (Heap.Entry _ now, todo2) | threadsCount s < threadsLimit s -> do
            -- spawn a new worker
            t <- newThreadFinally (now >> worker pool) $ \t res -> case res of
                Left e -> withPool_ pool $ \s -> do
                    signalBarrier done $ Left e
                    pure (remThread t s){alive = False}
                Right _ ->
                    step pool $ pure . remThread t
            pure (addThread t s){todo = todo2}
ndmitchell commented 2 years ago

The reasoning for step spawning a new thread was to free up any used stack that might be on the thread already, and to make sure the exceptions didn't have any chance of bubbling up. Whether those are legitimate reasons is probably something you are most qualified to answer than me! Does that sound sensible or silly?

mpickering commented 2 years ago

@ndmitchell Spawning a new thread is fine but the question is why two actions are run in the spawned thread rather than one thread per action.