apple / swift-nio-http2

HTTP/2 support for SwiftNIO
https://swiftpackageindex.com/apple/swift-nio-http2/main/documentation/niohttp2
Apache License 2.0
465 stars 82 forks source link

Race condition causes header frame to be fired on HTTP2StreamChannel AFTER first data frame(s) #410

Closed qusc closed 1 year ago

qusc commented 1 year ago

Hi, I have a pretty weird bug going on, not sure if I'm doing something wrong but I thought I'd share this. I'm on tag 1.17.0. It's fairly easy for me to reproduce but I have quite a lot of custom code involved that I cannot easily share. I'll do my best though to describe what I think is happening.

So whenever we get a new decoded header frame in HTTP2CommonInboundStreamMultiplexer::receivedFrame(_:, context:, multiplexer:) https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2CommonInboundStreamMultiplexer.swift#L92 we create a new stream channel and trigger its initializer: https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2CommonInboundStreamMultiplexer.swift#L112 Then, we fire the header we got through the pipeline of this new channel as a first frame https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2CommonInboundStreamMultiplexer.swift#L113

The problem seems to be, that within the initialization of the new stream channel, eventually HTTP2StreamChannel::performActivation() will be called https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2StreamChannel.swift#L305 then https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2StreamChannel.swift#L707 which will trigger another read on the pipeline: https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2StreamChannel.swift#L712 That in turn might cause another frame to be processed in https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2CommonInboundStreamMultiplexer.swift#L70 recursively and fire that second frame through the pipeline of the now existent (in HTTP2CommonInboundStreamMultiplexer::streams) stream channel! The first (header) frame payload is still to be delivered since HTTP2CommonInboundStreamMultiplexer::receivedFrame(...) is still on the call stack, waiting for the initializer of the new stream channel to return here: https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2CommonInboundStreamMultiplexer.swift#L112

No idea why this bug wouldn't have appeared for anyone else earlier. Seems pretty fundamental. What could I be doing wrong here? I'm using a custom pipeline underneath the HTTP2 handler which implements a proxy connection / virtual socket.

Wrapping self.tryToRead() and self.deliverPendingWrites() in eventLoop.execute() like so

self.eventLoop.execute {
  self.tryToAutoRead()
  self.deliverPendingWrites()
}

in https://github.com/apple/swift-nio-http2/blob/044339de7bdffddb81cb77fc57fb77ff2e743088/Sources/NIOHTTP2/HTTP2StreamChannel.swift#L305 so that new frames are only read after the header has finished processing seems to solve the issue for me.

FranzBusch commented 1 year ago

From a quick look this looks like a reentrancy problem where as you pointed out we can re-order reads. This is a bit of an interesting one since anything can potentially happen between these two method calls

channel.configureInboundStream(initializer: self.inboundStreamStateInitializer)
channel.receiveInboundFrame(frame)

I am wondering if we could just reorder those two calls so that the read gets buffered before we configure the stream channel; however, I am deferring this to @Lukasa who is more familiar with the insides of our H2 stack.

Lukasa commented 1 year ago

Note that tryToAutoRead does not itself end up triggering a call to HTTP2CommonInboundStreamMultiplexer.receivedFrame. This can only happen if we hit the following series of events:

  1. HTTP2StreamChannel.read0
  2. unsatisfiedRead is false, so we set it to true.
  3. self.pendingReads.count == 0
  4. Leading to a call to self.parent?.read()
  5. This calls read on the parent channel.

So I think the reason this hasn't happened to anyone else is that it's very uncommon for read() to actually unbuffer a channelRead message. Typically it only sets flags that are operated on the next EL spin. This is not always true, however: some channel handlers will buffer up data that is freed on a read call, which might be happening to you. However, as most HTTP/2 handlers operate directly on socket channels, read is not going to synchronously trigger a read system call, and so will not emit more data directly.

Franz's suggestion of a fix seems appropriate: we can safely buffer up that headers frame and then hit the configure flow. @qusc would you like to write a test for this? You'll need to use a custom ChannelHandler to trigger the re-entrancy on read. You're then also welcome to fix the issue.

qusc commented 1 year ago

Thanks for your support! Let me know if the fix and test case I added to my pull request are implemented appropriately @Lukasa