kazu-yamamoto / http2

HTTP/2.0 library including HPACK
BSD 3-Clause "New" or "Revised" License

Fix treatment of async exceptions #138

Closed edsko closed 4 months ago

edsko commented 4 months ago

In https://github.com/kazu-yamamoto/http2/pull/92 we added an exception handler that was meant to catch all exceptions (sync and async). This got changed in https://github.com/kazu-yamamoto/http2/pull/114 (specifically, https://github.com/kazu-yamamoto/http2/pull/114/commits/52a9619ba95b67d469205cb0dea546ada8489baa): when we moved from Control.Exception to UnliftIO.Exception, we got different behaviour for catch and friends (see https://github.com/well-typed/grapesy/issues/193#issuecomment-2238704595 for a full list). This commit fixes some unintended consequences of this change.
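To make the difference concrete, here is a minimal sketch using only base. It assumes that the unliftio-style `catch` behaves roughly like the hypothetical `catchSyncOnly` below (handle synchronous exceptions, rethrow anything wrapped in `SomeAsyncException`); `isAsync`, `catchSyncOnly` and `handlerSawKill` are illustrative names, not part of http2 or unliftio.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception
  (SomeAsyncException, SomeException, catch, fromException, throwIO)

-- True for exceptions wrapped in SomeAsyncException (for example ThreadKilled).
isAsync :: SomeException -> Bool
isAsync e = case (fromException e :: Maybe SomeAsyncException) of
  Just _  -> True
  Nothing -> False

-- Hypothetical helper: handle synchronous exceptions only and rethrow
-- asynchronous ones (assumed to approximate the unliftio-style `catch`).
catchSyncOnly :: IO a -> (SomeException -> IO a) -> IO a
catchSyncOnly action handler =
  action `catch` \e -> if isAsync e then throwIO e else handler e

-- Kill a blocked thread that runs under the given catch function and report
-- whether the supplied handler got to see the ThreadKilled exception.
handlerSawKill :: (IO () -> (SomeException -> IO ()) -> IO ()) -> IO Bool
handlerSawKill catcher = do
  started <- newEmptyMVar
  result <- newEmptyMVar
  tid <- forkIO $
    catcher (putMVar started () >> threadDelay 10000000) (\_ -> putMVar result True)
      `catch` \e -> if isAsync e then putMVar result False else throwIO e
  takeMVar started   -- make sure the worker is inside the handler scope
  killThread tid
  takeMVar result
```

With base's `catch` at type `SomeException`, the handler runs when the thread is killed; with `catchSyncOnly` it is bypassed, which is the kind of silent behaviour change described above.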

I tried to do an exhaustive check of all uses of these functions in http2; I'll post a full report separately.

edsko commented 4 months ago

I checked all uses of catch and friends (all functions that ignore async exceptions in unliftio but not in base) in http2:

I also checked time-manager, which contains this twice:

onTimeout `E.catch` ignoreAll

The intention here, I think, is to ignore exceptions thrown by the timeout handler itself, so it's indeed correct not to catch async exceptions here. I left this one unchanged as well.

kazu-yamamoto commented 4 months ago

After discussing with @khibino some days ago, I began to think that asynchronous exceptions are a bad pattern in Haskell network programming. I should think about whether or not I can remove asynchronous exceptions from http2.

edsko commented 4 months ago

I think the test failure we're seeing is unrelated to this PR. I can reproduce it on my machine on main as well:

HTTP2.Server
  server
    handles normal cases [ ]

(then timeout). It happens only very rarely, but it does happen. I'm also seeing these show up from time to time, on main:

  test/HTTP2/ServerSpec.hs:47:9: 
  1) HTTP2.Server.server handles normal cases
       uncaught exception: IOException of type NoSuchThing
       Network.Socket.connect: <socket: 13>: does not exist (Connection refused)

  To rerun use: --match "/HTTP2.Server/server/handles normal cases/" --seed 148294499

edsko commented 4 months ago

Have rebased on latest main.

edsko commented 4 months ago

> I should think about whether or not I can remove asynchronous exceptions from http2.

Asynchronous exceptions are indeed notoriously difficult to deal with. Removing them from http2 completely would be difficult, however: how are you going to kill worker threads when they time out, or tell them that the client has disappeared? I suppose you could give them a (T)MVar to poll, and leave it as their responsibility. Indeed, in grapesy I do something along these lines: I run the workers in separate threads that http2 is unaware of, and the main thread just sits there waiting for either the worker thread to terminate or http2 to throw a (KilledByHttp2ThreadManager) exception.

This feels like quite a large design departure (though a compat shim could be provided that just spawns an additional thread, waits on the (T)MVar, and kills the main thread when told to, I guess).
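As a rough illustration of the polling idea, here is a minimal sketch in which the worker checks a `TVar Bool` at the start of every iteration instead of being killed from outside. All names (`workerLoop`, `startWorker`, `demo`) are hypothetical and nothing here is taken from http2 or grapesy.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
  (TVar, atomically, check, modifyTVar', newTVarIO, readTVar, readTVarIO, writeTVar)

-- Run `step` repeatedly, checking the stop flag before each iteration.
workerLoop :: TVar Bool -> IO () -> IO ()
workerLoop stop step = do
  stopRequested <- readTVarIO stop
  if stopRequested then pure () else step >> workerLoop stop step

-- Start a worker and return an action that asks it to stop and then waits
-- until it has actually exited. No asynchronous exception is thrown.
startWorker :: IO () -> IO (IO ())
startWorker step = do
  stop <- newTVarIO False
  done <- newEmptyMVar
  _ <- forkFinally (workerLoop stop step) (\_ -> putMVar done ())
  pure (atomically (writeTVar stop True) >> takeMVar done)

-- Usage: let the worker count for a while, stop it, and read the final count.
demo :: IO Int
demo = do
  counter <- newTVarIO (0 :: Int)
  stopWorker <- startWorker (atomically (modifyTVar' counter (+ 1)))
  atomically (readTVar counter >>= check . (>= 3))  -- wait for some progress
  stopWorker
  readTVarIO counter
```

The cost of this design is that a worker only notices the request between steps, so a step that blocks indefinitely still needs some other wake-up mechanism.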

kazu-yamamoto commented 4 months ago

Threads should check STM at the beginning of each loop iteration. They can also check whether their sockets are ready for reading:

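import Control.Concurrent (threadWaitReadSTM)
import Control.Concurrent.STM (STM)
import Network.Socket (Socket, withFdSocket)
import System.Posix.Types (Fd (..))

-- Returns an STM action that blocks until the socket is readable,
-- plus an IO action that deregisters interest in the socket.
checkReadAvailable :: Socket -> IO (STM (), IO ())
checkReadAvailable s = withFdSocket s $ \fd -> threadWaitReadSTM $ Fd fd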

Timeout can be implemented with SystemTimerManager.
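One way such a readiness check might be combined with a stop flag is sketched below. It works on a raw `Fd` rather than a `Socket` so that it depends only on base, stm and unix; `Wake`, `waitReadableOrStop` and `demoWake` are hypothetical names, and the pipe in `demoWake` merely stands in for a socket.

```haskell
import Control.Concurrent (threadWaitReadSTM)
import Control.Concurrent.STM
  (TVar, atomically, check, newTVarIO, orElse, readTVar, writeTVar)
import Control.Exception (finally)
import System.Posix.IO (closeFd, createPipe, fdWrite)
import System.Posix.Types (Fd)

-- Why the thread woke up.
data Wake = Readable | StopRequested
  deriving (Eq, Show)

-- Block until either the stop flag is set or the descriptor is readable.
-- The stop flag is checked first, so it wins when both conditions hold.
waitReadableOrStop :: Fd -> TVar Bool -> IO Wake
waitReadableOrStop fd stop = do
  (waitRead, deregister) <- threadWaitReadSTM fd
  atomically
    ((StopRequested <$ (readTVar stop >>= check)) `orElse` (Readable <$ waitRead))
    `finally` deregister

-- Usage with a pipe standing in for a socket.
demoWake :: IO (Wake, Wake)
demoWake = do
  (readEnd, writeEnd) <- createPipe
  stop <- newTVarIO False
  _ <- fdWrite writeEnd "x"             -- make the read end readable
  first <- waitReadableOrStop readEnd stop
  atomically (writeTVar stop True)      -- now request a stop
  second <- waitReadableOrStop readEnd stop
  closeFd readEnd >> closeFd writeEnd
  pure (first, second)
```

Because both conditions are ordinary STM actions, the thread blocks on them together and never has to be interrupted by an exception.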

kazu-yamamoto commented 4 months ago

After thinking about this issue for a day, I decided:

(1) I will merge #137 and #138 as a workaround.
(2) Then I will get rid of asynchronous exceptions someday.

edsko commented 4 months ago

Yes, a polling setup like you describe is possible. For the timer manager, do you mean https://hackage.haskell.org/package/base-4.20.0.1/docs/GHC-Event.html#t:TimerManager ? If so, I guess we could use it to register an action that writes to another STM variable, so that you can poll that as well.
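A minimal sketch of that idea, assuming the `GHC.Event` timer manager is what is meant: a timeout callback sets a `TVar Bool` that a worker can poll or block on. `withTimeoutFlag` and `waitTimedOut` are hypothetical names, and as far as I know `getSystemTimerManager` requires the threaded runtime (`-threaded`).

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, check, newTVarIO, readTVar, writeTVar)
import qualified Control.Exception as E
import GHC.Event (getSystemTimerManager, registerTimeout, unregisterTimeout)

-- Hypothetical helper: run `body` with a flag that the timer manager sets
-- to True after `micros` microseconds. The timeout is unregistered when
-- `body` finishes, whether normally or with an exception.
withTimeoutFlag :: Int -> (TVar Bool -> IO a) -> IO a
withTimeoutFlag micros body = do
  flag <- newTVarIO False
  mgr <- getSystemTimerManager
  E.bracket
    (registerTimeout mgr micros (atomically (writeTVar flag True)))
    (unregisterTimeout mgr)
    (\_ -> body flag)

-- STM action that blocks until the timeout has fired; it can be combined
-- with other conditions via `orElse`.
waitTimedOut :: TVar Bool -> STM ()
waitTimedOut flag = readTVar flag >>= check
```

A worker would then check `flag` alongside its stop variable at the start of each loop iteration, so a timeout becomes just another condition to poll.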

edsko commented 4 months ago

> After thinking about this issue for a day, I decided:
>
> (1) I will merge #137 and #138 as a workaround.
> (2) Then I will get rid of asynchronous exceptions someday.

Ok, that works for me :) If and when you have a PR that removes all async exceptions, feel free to ping me to try it out with grapesy.

kazu-yamamoto commented 4 months ago

> For the timer manager, do you mean https://hackage.haskell.org/package/base-4.20.0.1/docs/GHC-Event.html#t:TimerManager ?

Yes. @khibino and I actually use it in dnsext libraries.

kazu-yamamoto commented 4 months ago

> If and when you have a PR that removes all async exceptions, feel free to ping me to try it out with grapesy.

Thanks. I will ping you!

kazu-yamamoto commented 4 months ago

@edsko #137 has been merged. Unfortunately, #138 cannot be merged straightforwardly. Could you resolve the conflicts and rebase this PR onto the current main?

edsko commented 4 months ago

Yup, will do!

edsko commented 4 months ago

@kazu-yamamoto I have rebased (and also run fourmolu). Let me just try this out with the grapesy test suite as well before merging.

edsko commented 4 months ago

Ok, the grapesy test suite says all is fine :) (https://github.com/well-typed/grapesy/pull/196). I think this is good to go!

kazu-yamamoto commented 4 months ago

Merged. A new version has been released. Thanks!

edsko commented 4 months ago

Thanks @kazu-yamamoto !