apple / swift-nio

Event-driven network application framework for high performance protocol servers & clients, non-blocking.
https://swiftpackageindex.com/apple/swift-nio/documentation
Apache License 2.0
7.99k stars 651 forks source link

Discussion: Make Channel Handler API Typesafe #1354

Open helje5 opened 4 years ago

helje5 commented 4 years ago

This just came up in the server Slack and @weissi asked me to do a demo PR for discussion.

Issue

The pipeline is a linked list of channel handlers, those channel handlers have static input and output types for both I/O directions: Example:

private final class HTTPHandler: ChannelInboundHandler {
    public typealias InboundIn = HTTPServerRequestPart // InboundOut = Never
    public typealias OutboundOut = HTTPServerResponsePart // OutboundIn = Never
...
    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        let reqPart = self.unwrapInboundIn(data)

The type erasure in NIOAny is inconvenient and error prone. This is what would be desirable:

private final class HTTPHandler: ChannelInboundHandler {
    func channelRead(context: ChannelHandlerContext, data: HTTPServerRequestPart) {
...
}

Right now it is possible to configure a pipeline which doesn't match the static types, e.g. you can just do this:

pipeline.addHandler(HTTPDecoder()).then {
  pipeline.addHandler(SSLHandler())
}

Which will crash at runtime.

Now I can't say how the actual NIO API should be transformed best, but in Noze I have this for the type safe streams. It guarantees that only compatible streams can be piped together:

public extension GReadableStreamType {
  func pipe<TO: GWritableStreamType>
                (_ outStream: TO,
                 endOnFinish: Bool = true, passErrors: Bool = true)
              -> TO
              where Self.ReadType == TO.WriteType

(or alternatively an operator, which allows stuff like request | deflate | decode<Event> | ..., but stream pipes are a little different to the two-way channel handlers)

The NIO thing might look something like:

extension ChannelHandler {
  func addHandler<Next>(_ handler: Next)
    where Next.InboundIn == Self.InboundOut && Next.OutboundOut == Self.OutboundIn
  @availability(unavailable)
  func addHandler<Next>(_ handler: Next) where Next.InboundIn == Never
  ...
}

(the actual API would probably look quite a bit different, with Futures and all that)

Advantages:

Disadvantage:

Note: This is not about a static pipeline at all (something @weissi is also considering). The pipeline is still dynamic, just the connection points become type-safe, i.e. you can hookup incompatible handlers.

Inserting

Inserting channels: @weissi asked about this example:

pipeline.insert(SSLHandler(), at: 1) // or sth like that.

Again I'm not sure how the actual API would look like. In Noze you'd conceptually have to grab a handle to the previous stream and then re-setup the streams, say:

let oldTarget = socket.unpipe()
let ssl = SSLHandler()
socket.pipe(ssl).pipe(oldTarget) // or just socket | ssl | oldTarget

Which is fully type safe again (the compiler wouldn't allow a setup with incompatible types).

Type Erased API

While a fully type safe API would be best, this could also do some type erasure for compat or other reasons. But instead of doing the type wrapping/unwrapping in the channel handlers for every packet, it would be much better to do it when the handlers are added / inserted. E.g. there could still be a pipeline.insert(anyHandler, at: 1), which would then do the wrapping/unwrapping of the handler for the required pre/post handlers in the pipeline (and fatalError at that point on mismatch).

Disclaimer

This is mostly an idea. I think it is highly desirable for various reasons. I originally thought that NIOAny is for performance reasons, but maybe those do not apply anymore with Swift 5, @inlinable etc.

Lukasa commented 4 years ago

Thanks for putting this idea here. I have a few suggestions for areas where this idea needs to elaborate on some of the difficulties.

Missing Types

Right now your proposal looks like this:

extension ChannelHandler {
  func addHandler<Next>(_ handler: Next)
    where Next.InboundIn == Self.InboundOut && Next.OutboundOut == Self.OutboundIn

Not all ChannelHandlers are required to provide all 4 associated types. In particular:

In this context, we'd want to propagate the type information: whatever the InboundOut of the previous handler was becomes our InboundIn and then our own InboundOut, and a similar requirement would need to be applied for the outbound types.

Insertion

You've called out that you aren't sure about insertion, and then proposed a rough suggestion. This suggestion needs elaboration, because it's where the bulk of the difficulty comes from.

There are three cases: inserting at the front, inserting between two handlers, and inserting at the back. The second case is the easiest: by definition to insert between two handlers requires that the inserted handler has InboundIn==InboundOut && OutboundIn==OutboundOut. The wrinkle comes when we are at the head or the tail.

For the head, there is implicitly a ChannelHandler at the front that has InboundOut == <ChannelInType> && OutboundIn == <Channel out type>, where the channel native I/O types are some of: ByteBuffer, AddressedEnvelope<ByteBuffer>, IOData, and Never. These types need to come from somewhere, and they would need to do so at compile time. To work around this we would have to wait for fully-fledged opaque result types to carry this type information, so in the meantime we'd just quietly have to assume that these were Any.

For the tail, there is again an implicit channel handler where InboundIn and OutboundOut are Any, with one wrinkle, which is that Channel.write is also untyped. This means that Any actually means two things in this type system: in some places it means "I don't care about the type, I'll preserve it", and in some places it means "I don't know about the type, it may be anything".

I don't want to crush this idea, but I do want to point out the places where it has to be clarified in order to work well.

helje5 commented 4 years ago

Not all ChannelHandlers are required to provide all 4 associated types

But this is only the case because you use them as the base protocol for type erasure, right? Conceptually a handler should always have all four of them, with the leafs being typed to Never. If you want a handler which just does book keeping (or handle errors), you'd have to make it generic. Like:

class ReadCounter<InboundIn, OutboundOut>: ChannelHandler {
  typealias InboundOut = InboundIn
  typealias OutboundIn = OutboundOut
  read(_ data: InboundIn) {
    counter += 1
    fireRead(data)
  }
  write(_ data: OutboundOut) { fireWrite(data) }
}
Lukasa commented 4 years ago

Sure, that's not a problem, though it does make the change semver major.

Lukasa commented 4 years ago

Ah, no, I guess it doesn't have to, we can always allow both APIs.

helje5 commented 4 years ago

Insertion head

For the head, there is implicitly a ChannelHandler at the front that has InboundOut == && OutboundIn == , where the channel native I/O types are some of: ByteBuffer, AddressedEnvelope, IOData, and Never. These types need to come from somewhere

Well, yes. But the bootstrap callback does get the type of the channel it is bootstrapping. So depending on the socket you are setting up, you'd get the initial hooking point to conform to say HookupPoint<In: ByteBuffer, Out: ByteBuffer>.

I don't get why it would need to be Any.

helje5 commented 4 years ago

Wrt to SemVer, I don't know - I didn't claim SemVer compat. A clean implementation would probably change the API and internals, but I suppose it could also be done as an addition to the current API (with the stuff still being held type erased, don't know).

Lukasa commented 4 years ago

I don't get why it would need to be Any.

It needs to be Any if we allow insertion at the head after the bootstrap has occurred.

helje5 commented 4 years ago

It needs to be Any if we allow insertion at the head after the bootstrap has occurred.

Nope, the bootstrap can forward the type/hooking-point to for example the protocol switcher if you want to be type safe (as shown in my Noze sample), but you can also always hack-insert if you really want to (and have the type checked at runtime).

Essentially something like:

bootstrap { firstChannelHandler in
  firstChannelHandler.addHandler(ProtocolSwitcher(firstChannelHandler))
}

class Switcher<T>  {
  source: T

  func switch(to other: T) {
    source.remove(self)
    source.addHandler(other)
  }
}
Lukasa commented 4 years ago

We'd need to make sure to standardise that pattern to allow for libraries to request that functionality in a standard way, but I think that's reasonable.

helje5 commented 4 years ago

with one wrinkle, which is that Channel.write is also untyped

Hm, OK. With a clean implementation I'd assume that this would go. For compat it could just do the dynamic thing.

Channel.write does two things, right: 1) lookup the tail 2) synchronise the loop

I think 1) should be done in the app level. It likely has the tail (it must know it, to know what it can put into the write anyways).

And 2) is something which could also live in the handler itself, like threadSafeWrite() or whatever the name would be, just doing the "isInLoop else dispatch" thing.