joeyadams / hs-windows-iocp

Playground for implementing interruptible I/O on Windows using completion ports (rewrite of my haskell-iocp)
BSD 3-Clause "New" or "Revised" License

Hey! #1

Open dmjio opened 10 years ago

dmjio commented 10 years ago

Glad to see someone working on this. I'm going to need to use Haskell on Windows and will be doing a significant amount of I/O. I saw your ticket (linked below) and wanted to know if there's anything I can do to help you.

https://ghc.haskell.org/trac/ghc/ticket/7353

joeyadams commented 10 years ago

I appreciate your interest, but I didn't have time to finish my work on Windows I/O. I resorted to simple hacks to work around the issues I encountered (namely, I couldn't interrupt network I/O system calls). I now work at a .NET shop, so I currently have very little time to work on this.

My workaround was to use socket timeouts (see my network-socket-options package on Hackage). In one case, I had to make custom modifications to a library, since it didn't expose the underlying socket or handle.

Some difficulties:

It would be great to see good Haskell I/O support for Windows, but plan on several months of effort to get things fixed in GHC and the network package. I can help you understand the I/O manager design, but I probably won't have time to contribute code.


dmjio commented 10 years ago

@joeyadams, sorry I haven't responded sooner. Thanks a ton for such a detailed description of the situation. I initially thought that _no_ I/O operations on Windows were interruptible (that really worried me), but from what you've described it's just network I/O (which is still very significant).

What I'm trying to do is a little unique. I have a 10-core machine that needs to run as many instances of a single-threaded, RAM-guzzling process as it can on Windows 7 x64. I have one Haskell worker process that forks 10 threads to spawn subprocesses (C# applications), which spawn and interact with yet another process via its API, shut it down, and return the result to the Haskell process via the stderr / stdout file descriptors. In the mock implementation below, action performs a lot of network I/O (communicating with S3, a remote acid-state, etc.) and calls runInteractiveProcess to spawn the C# app.

There really isn't any synchronization that needs to occur between threads, other than the main thread killing a child thread (until I implement logging with Chans, but that shouldn't be an issue). My biggest concern is this: if a thread becomes unresponsive with a ThreadBlocked status, can it be killed?

Here's a mock implementation of what I'm doing.

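{-# LANGUAGE RecordWildCards #-}

import Control.Concurrent (ThreadId, forkIO, getNumCapabilities, killThread, myThreadId, threadDelay)
import Control.Monad (forM, forM_, forever)
import Control.Monad.State (evalStateT, get, liftIO, modify)
import Data.List (delete)
import GHC.Conc (BlockReason(..), ThreadStatus(..), threadStatus)

-- Fork one worker per capability; each worker runs the action forever.
doWork :: IO () -> IO [ThreadId]
doWork action = do
    caps <- getNumCapabilities
    forM [1..caps] $ \_ ->
        forkIO $ forever $ do { print =<< myThreadId; action }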

action :: IO ()
action = do 
      acid <- openRemote -- network I/O
      qMsg{..} <- query acid ReadQueue
      -- .... downloads a file from s3 
      -- .. runInteractiveProcess on C# app.. get result via stderr / stdout
      -- .. upload result to S3, send email ... more network I/O, etc.

main :: IO ()
main = do
    threadIds <- doWork action
    flip evalStateT threadIds $ forever $ do
        tids <- get
        forM_ tids $ \threadId -> do
            status <- liftIO $ threadStatus threadId
            case status of
                ThreadDied -> do
                    -- Replace the dead worker with a fresh one.
                    modify (delete threadId)
                    liftIO $ killThread threadId
                    newTid <- liftIO $ forkIO action
                    modify (newTid:)
                    liftIO $ threadDelay $ seconds 1
                ThreadBlocked reason -> case reason of
                    BlockedOnException ->
                        -- CAN IT BE KILLED HERE? Since it's an exception, network I/O
                        -- has abruptly ended, right? So there shouldn't be a problem.
                        -- Since I'm running w/ the -threaded option, will this even occur?
                        return ()
                    _ -> return ()
                _ -> return ()

seconds :: Int -> Int
seconds = (* 1000000)  -- threadDelay takes microseconds

I'd like things to be fast, but this isn't a financial trading application. Microsecond differences are fine. At times it might be preferable to kill a thread even if its status is ThreadRunning and network I/O could be happening. Given what you said earlier, it seems the only thing I can do is patch acid-state and the conduit libraries so that they expose the underlying sockets, then use your library to set timeouts on them (correct me if I'm wrong).

I would be interested in learning more. I don't want to take too much of your time, maybe you could point me to some required reading before we get into it. I'd like to know what you mean by:

"Scalable I/O on Windows is based on completion queues, not file descriptor polling."

You also mentioned: "The Windows I/O completion port API isn't designed with light-weight threads in mind, but Windows Vista introduced fixes to some of the issues (like being able to cancel I/O started from another OS thread)." Does this mean Windows isn't using green threads?

GHC's deep internals are uncharted territory for me, but I want to get into it.

joeyadams commented 10 years ago

On Wed, Apr 30, 2014 at 6:15 PM, David Johnson notifications@github.com wrote:

I think I initially thought that no I/O operations on Windows were interruptible (that really worried me), but it seems from what you've described it's just network I/O (which is still very significant).

I may be wrong, but with GHC, file I/O isn't interruptible on Linux or Windows. But this shouldn't be a big deal, since file I/O typically doesn't block for a long time. What normally blocks for a long time is network or pipe I/O that waits for something, like recv or accept. Not being able to interrupt these is a problem for long-running programs when you have to deal with hosts that aren't always responsive.
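To make concrete what interruptibility buys you, here is a minimal sketch using timeout from System.Timeout in base. The names withDeadline and slowRecv are hypothetical, and threadDelay is only a stand-in for a blocking recv on an unresponsive host. timeout works by throwing an asynchronous exception at the blocked thread, so it only helps when the blocked call can actually be interrupted; if it can't, the result would not arrive until the call returns on its own.

```haskell
import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- | Run a blocking action, giving up after the given number of microseconds.
-- This relies on the action being interruptible by an asynchronous exception.
withDeadline :: Int -> IO a -> IO (Maybe a)
withDeadline = timeout

-- | Hypothetical stand-in for a recv on an unresponsive host: blocks for 2 s.
slowRecv :: IO String
slowRecv = threadDelay 2000000 >> return "reply"

main :: IO ()
main = do
    r <- withDeadline 100000 slowRecv   -- give up after 0.1 s
    case r of
        Nothing  -> putStrLn "timed out; the blocked call was interrupted"
        Just msg -> putStrLn ("got: " ++ msg)
```

threadDelay is interruptible, so the deadline fires here. The socket timeouts mentioned earlier are the fallback for calls where this mechanism doesn't work.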

When a thread has the ThreadDied status, it can be interrupted (doesn't do anything, since the thread isn't executing code anymore). When it has ThreadBlocked status, it depends on the reason. BlockedOnForeignCall means the thread can't be interrupted because it's running native code. I'm not sure about BlockedOnOther. All the other block reasons can be interrupted, since the thread is under the control of the RTS in these cases.
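The rules above can be written down as a small lookup over GHC.Conc's ThreadStatus. This is only a sketch of that paragraph: the Verdict type and the canKill name are made up for illustration, and the ThreadRunning and ThreadFinished cases are assumptions, since they aren't covered above.

```haskell
import GHC.Conc (BlockReason(..), ThreadStatus(..))

-- | Hypothetical summary of what killThread can be expected to do.
data Verdict
    = NothingToKill  -- thread is no longer executing code; killThread is a no-op
    | Killable       -- thread is under RTS control and can be interrupted
    | NotKillable    -- thread is running native code
    | Unsure         -- not settled by the discussion above
    deriving (Eq, Show)

canKill :: ThreadStatus -> Verdict
canKill ThreadDied                           = NothingToKill
canKill ThreadFinished                       = NothingToKill  -- assumption: treated like ThreadDied
canKill ThreadRunning                        = Unsure         -- assumption: not covered above
canKill (ThreadBlocked BlockedOnForeignCall) = NotKillable
canKill (ThreadBlocked BlockedOnOther)       = Unsure
canKill (ThreadBlocked _)                    = Killable       -- MVar, black hole, exception, STM

main :: IO ()
main = mapM_ (print . canKill)
    [ ThreadDied
    , ThreadBlocked BlockedOnException
    , ThreadBlocked BlockedOnForeignCall
    ]
```

A supervisor loop like the one in the mock implementation could consult canKill on the result of threadStatus before deciding whether to call killThread.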