docker / libchan

Like Go channels over the network
Apache License 2.0
2.47k stars 142 forks source link

handling many request types #73

Open tve opened 9 years ago

tve commented 9 years ago

Is there a recommended pattern for implementing a server that can handle a multitude of different requests? The question I have is how to represent the request. For example, suppose I have the following requests:

type EchoRequest struct {
  Cmd: string // "echo"
  Text: string
  Ret: libchan.Sender
}
type AddRequest struct {
  Cmd: string // "add"
  A, B: int
  Ret: libchan.Sender
}

What would I Receive into on the server side without going through map[interface{}]interface{} hell? The cleanest I can come up with is to give each request type its own (well-typed) channel and to make the root channel just send a map of request-type=>channel. If these were Go channels I would receive into an interface{}, look up the command, and then coerce the interface{} to EchoRequest or AddRequest. But that doesn't work with libchan as far as I can see. I'm probably missing something obvious...

stevvooe commented 9 years ago

There are a few ways to do this but the easiest is to have a type with pointers to each possible request type. This works kind of like a union. Here is an example:

type RequestEnvelope struct {
    *EchoRequest
    *AddRequest
}

These can be received over a channel and you can dispatch on the non-nil request. While the in-memory object may require space for the pointers, the codec should omit nil-valued pointer, so you'll only pay for what is used on the wire.

Another more complex option would be to have a channel for each request type, with a goroutine for each channel. This approach might be good for protocols that require a lot of concurrent interaction over the channel. In the future, some sort select functionality would make this the recommended approach.

tve commented 9 years ago

Can you post a snippet of code that shows how your struct of pointers works? I tried the following and Encode barfs with Encode: reflect: indirection through nil pointer to embedded struct

package main

import (
        "fmt"
        "log"
)
import "github.com/dmcgowan/go/codec"

type A struct {
        A1 int
        A2 string
}
type B struct {
        B1 string
        B2 int64
}
type MSG struct {
        *A
        *B
}

func main() {
        var mh codec.MsgpackHandle
        b := make([]byte, 10240)

        a := A{3, "foo"}
        msg := MSG{&a, nil}

        enc := codec.NewEncoderBytes(&b, &mh)
        err := enc.Encode(msg)
        if err != nil {
                log.Fatalf("Encode: %s", err)
        }

        var rcv interface{}
        dec := codec.NewDecoderBytes(b, &mh)
        err = dec.Decode(&rcv)
        if err != nil {
                log.Fatalf("Decode: %s", err)
        }

        fmt.Printf("Sent %#v\n", msg)
        fmt.Printf("Recv %#v\n", rcv)
}
tve commented 9 years ago

Apologies for the multiple comments. I'll stop here. I did manage to get a variant of the above to work, which I'm starting to warm up to. In this example there are two message types: A and B. Note how the sender wraps the message of type B into a struct that has only a B, which allows the sender to decode it into a MSG struct that has a pointer to all the possible messages like you suggested. The nice thing here is that the sender doesn't need to know about any message type other than the one he sends.

package main

import (
        "fmt"
        "log"
        "reflect"
)
import "github.com/dmcgowan/go/codec"

type A struct {
        A1 int
        A2 string
}
type B struct {
        B1 string
        B2 int64
}
type MSG struct {
        A *A
        B *B
}

func main() {
        mh := codec.MsgpackHandle{}
        mh.MapType = reflect.TypeOf(map[string]interface{}(nil))
        buf := make([]byte, 10240)

        b := B{"hello", 3}
        msg_b := struct{ B B }{b}
        fmt.Printf("Send %#v\n", msg_b)

        enc := codec.NewEncoderBytes(&buf, &mh)
        err := enc.Encode(msg_b)
        if err != nil {
                log.Fatalf("Encode: %s", err)
        }

        var rcv interface{}
        dec := codec.NewDecoderBytes(buf, &mh)
        err = dec.Decode(&rcv)
        if err != nil {
                log.Fatalf("Decode: %s", err)
        }
        fmt.Printf("Raw Recv: %#v\n", rcv)

        var msg MSG
        dec = codec.NewDecoderBytes(buf, &mh)
        err = dec.Decode(&msg)
        if err != nil {
                log.Fatalf("Decode: %s", err)
        }
        fmt.Printf("Msg Recv: %#v\n", msg)
        fmt.Printf("Msg.A Recv: %#v\n", msg.A)
        fmt.Printf("Msg.B Recv: %#v\n", msg.B)
}

The result of running this is:

$ go run test.go
Send struct { B main.B }{B:main.B{B1:"hello", B2:3}}
Raw Recv: map[string]interface {}{"B":map[string]interface {}{"B2":3, "B1":[]uint8{0x68, 0x65, 0x6c, 0x6c, 0x6f}}}
Msg Recv: main.MSG{A:(*main.A)(nil), B:(*main.B)(0xc2080433a0)}
Msg.A Recv: (*main.A)(nil)
Msg.B Recv: &main.B{B1:"hello", B2:3}
mcollina commented 9 years ago

Folks, libchan in its form is both a spec and an implementation, and it should be designed to be implementable in other platform. Strings works cross-platform. Pointers definitely not. Il giorno lun 8 dic 2014 alle 03:49 Thorsten von Eicken < notifications@github.com> ha scritto:

Apologies for the multiple comments. I'll stop here. I did manage to get a variant of the above to work, which I'm starting to warm up to. In this example there are two message types: A and B. Note how the sender wraps the message of type B into a struct that has only a B, which allows the sender to decode it into a MSG struct that has a pointer to all the possible messages like you suggested. The nice thing here is that the sender doesn't need to know about any message type other than the one he sends.

package main

import ( "fmt" "log" "reflect" ) import "github.com/dmcgowan/go/codec"

type A struct { A1 int A2 string } type B struct { B1 string B2 int64 } type MSG struct { A A B B }

func main() { mh := codec.MsgpackHandle{} mh.MapType = reflect.TypeOf(map[string]interface{}(nil)) buf := make([]byte, 10240)

    b := B{"hello", 3}
    msg_b := struct{ B B }{b}
    fmt.Printf("Send %#v\n", msg_b)

    enc := codec.NewEncoderBytes(&buf, &mh)
    err := enc.Encode(msg_b)
    if err != nil {
            log.Fatalf("Encode: %s", err)
    }

    var rcv interface{}
    dec := codec.NewDecoderBytes(buf, &mh)
    err = dec.Decode(&rcv)
    if err != nil {
            log.Fatalf("Decode: %s", err)
    }
    fmt.Printf("Raw Recv: %#v\n", rcv)

    var msg MSG
    dec = codec.NewDecoderBytes(buf, &mh)
    err = dec.Decode(&msg)
    if err != nil {
            log.Fatalf("Decode: %s", err)
    }
    fmt.Printf("Msg Recv: %#v\n", msg)
    fmt.Printf("Msg.A Recv: %#v\n", msg.A)
    fmt.Printf("Msg.B Recv: %#v\n", msg.B)

}

The result of running this is:

$ go run test.go Send struct { B main.B }{B:main.B{B1:"hello", B2:3}} Raw Recv: map[string]interface {}{"B":map[string]interface {}{"B2":3, "B1":[]uint8{0x68, 0x65, 0x6c, 0x6c, 0x6f}}} Msg Recv: main.MSG{A:(_main.A)(nil), B:(_main.B)(0xc2080433a0)} Msg.A Recv: (*main.A)(nil) Msg.B Recv: &main.B{B1:"hello", B2:3}

— Reply to this email directly or view it on GitHub https://github.com/docker/libchan/issues/73#issuecomment-65970133.

dmcgowan commented 9 years ago

@mcollina I don't think the use of pointers here is anything that would cause compatibility issues. However I am not sure I agree with the general pattern of having a shell type and selecting the type based on the nil/empty. The msgpack implementation does lack some features but I think the ideal schema should still be preferred, even if that means using map[interface{}]interface{}. Most other languages and my guess the jschan implementation will not have as much difficulty with the typing. For the go implementation I am working on a RawMessage type similar to what is in the Go encoding/json package which would allow for partial and late decoding of the message.

My thought is the following is how your structs may end of looking. Add post receiving a request and checking the cmd, RawMessage would be able to be decoded into the expected type. Until then the same pattern (add same over the wire msgpack) could be achieved with map with decoders from map[interface{}]interface{}.

type EchoArgs struct {
        A1 int
        A2 string
}

type AddArgs struct {
        B1 string
        B2 int64
}

type EchoRequest struct {
  Cmd  string // "echo"
  Data *EchoArgs
  Ret  libchan.Sender
}

type AddRequest struct {
  Cmd  string // "add"
  Data *AddArgs
  Ret  libchan.Sender
}

type Request struct {
Cmd  string
Data *msgpack.RawMessage
Ret  libchan.Sender
}
tve commented 9 years ago

@dmcgowan While I don't disagree with the "partial and late decoding" do look at what the pointer approach produces. In particular, the message sent just has one of the "branches" of the union in it. It doesn't even have all the nil pointers. It seems to me that the resulting msgPack encoding is actually pretty darn good (and portable):

Send struct { B main.B }{B:main.B{B1:"hello", B2:3}}
Raw Recv: map[string]interface {}{"B":map[string]interface {}{"B2":3, "B1":[]uint8{0x68, 0x65, 0x6c, 0x6c, 0x6f}}}
stevvooe commented 9 years ago

@mcollina

Folks, libchan in its form is both a spec and an implementation, and it should be designed to be implementable in other platform. Strings works cross-platform. Pointers definitely not.

While pointers may not work in certain languages, the presence or absence of a value should definitely be supported. There is no way this concept isn't trivially implementable for other platforms (java: Option, c++: pointer, js: null or missing key, python: None or missing key, scala: Option, haskell: Maybe, etc.). In Go, it's idiomatic among the encoding libraries (json, gob, xml, protobuf, etc.) to support the notion of "zero pointer" to communicate absent serialized values. Not supporting nil pointers in this manner is at odds with familiar behavior in existing Go encoding libraries.

@tve I should have vetted my example but I chose to express the fields as embedded anonymous structs. Apologies. Currently, libchan does not correctly support embedded structs, which needs to be fixed. The example should have been this:

type RequestEnvelope struct {
    EchoRequest *EchoRequest
    AddRequest *AddRequest
}

It looks like you discovered that ;).

dmcgowan commented 9 years ago

@stevvooe in the case of msgpack, omitting a nil pointer should really be done by annotation flags rather than assuming a nil pointer will not serialize. In msgpack nil is a valid value and it could be included, even if the codec library by default excludes it. Pointers are definitely valid and of course dereferenced before encoding, what is probably odd to non-go devs is the role of annotations in cuing the encoder to do the desired behavior. See http://golang.org/pkg/encoding/json/#Marshal

tve commented 9 years ago

I don't think this is just about omitting nil pointers or not. It's also about extensibility. At the very least the server needs to be able to support a superset of what the client knows about. The most basic principle is that the server needs to be able to recognize new fields that older clients may not send. Ideally, if the sender includes fields that the recipient doesn't recognize then they are ignored.

mcollina commented 9 years ago

@dmcgowan as long as messages are dereferenced in a single JSON-like blob of maps and arrays, that's fine. I think I misunderstood a chunk of what is discussed here (being go-newbie):

type Request struct {
Cmd  string
Data *msgpack.RawMessage
Ret  libchan.Sender
}

This is just an "internal" and generic Request struct, it is not something that is sent/recevied on the wire as it is. Whatever is in the data field will be translated into a msgpack5 map during encoding, but I am not sure how the type information will be transferred over to support decoding. As long as there is no custom msgpack5 extids around any dereferencing will be fine.

@stevvooe as @dmcgowan said, the msgpack5 spec says that 'nil' exists, and the most common mapping of the 'NULL' type is with msgpack5 nil. In my JS msgpack5 implementation, JS undefined is skipped, while null is passed through. This distinction is really language specific, and I would advise not relying on these nuances for the efficiency of a generic protocol.

@tve I advise against embedding some sort of discovery/version managament into the protocol itself. These services are much better exposed on top of the protocol itself rather than backed in. We had too many failure in that area (RMI, CORBA, even WSDL/SOAP) to bake the 'service governance' into the wire protocol. msgpack is free form, as long as msgpack is concerned we can place anything in it. As discussed with @dmcgowan in Amsterdam, I think having the (above) generic pattern makes complete sense:

type Request struct {
Cmd  string
Data *msgpack.RawMessage
Ret  libchan.Sender
}

I suggest we make this the convention of libchan.

stevvooe commented 9 years ago

@mcollina It's rather unfortunate that msgpack5 chose to include a 'NULL' type in the protocol. Formats like protobuf, gob and thrift gain a lot of efficiency by not transmitting null, default or zero-valued fields. This pattern is so common that the type oneof has even been added to the latest version of protobuf to support these kinds of unions. It would be unfortunate for libchan to not learn from these serialization libraries since they have seen a lot of action and are generally easy to use. They are also fantastic at protocol evolution, both by design and by personal experience.

That said, there is not much we can't express through struct tags. It's likely that we could choose a default behavior that meets common needs but drops nilled fields with the right tag ("union", maybe).

The main issue with the proposed Request envelope is type safety and protocol overview. To understand the requests that may belong in that message, one will have to look at the code that dispatches (Request).Cmd.

Using a declared response union, with all of its possible response types, makes understanding the protocol as simple as looking at a data structure:

type RequestEnvelope struct {
    EchoRequest *EchoRequest
    AddRequest *AddRequest
}

Now, if we add a new request type, we can simple append it:

type RequestEnvelope struct {
    EchoRequest *EchoRequest
    AddRequest *AddRequest
    PingRequest *PingRequest
}
mcollina commented 9 years ago

@stevvooe I thought a little bit more on the issue, and I think your original solution is feasible given some constraints. The key problem here is that default behavior of a msgpack library is to serialize 'null' values as NULL, but a flag 'do not include NULL' can be added to most msgpack libraries. Even if it is not added, we need to ensure that everything still works, i.e. if I send something with NULL it is deserialized correctly on the go-libchan side. In order to leverage those optimization, we just need to write this down in the protocol specification as a suggested optimization (SHOULD in rfc-lingo). The only key thing is not leveraging NULL within the interfaces, so no custom annotation for forcing a NULL value inside the message.

@dmcgowan what do you think?

These are really interesting discussions, thank you all so much!

tve commented 9 years ago

+1 on:

type Request struct {
Cmd  string
Data *msgpack.RawMessage
Ret  libchan.Sender
}

Does anyone have some code to implement this?

dmcgowan commented 9 years ago

@mcollina I think it is fair to document how NULL is expected to be used. I think in Go using NULL will be natural, but if it shows up on the JS chan as a field with a value of undefined, it could cause problems if an implementer is unaware this pattern is being used. So if we were to set a rule for using NULL, it would be that a NULL and omitted value should behave the same in the protocol.

@tve the code implementation is in progress. I hope to have to some time over the holidays to make progress on it because I think it is important to having a performant and type-flexible implementation. The new implementation should also be able to be more lenient on the types of messages that can be sent, allowing non-map/struct types to be sent and received. It has not been PR'ed yet but I can certainly create one to make tracking the changes easier. The working branch is at https://github.com/dmcgowan/libchan/tree/new_msgpack_implementation

mcollina commented 9 years ago

@dmcgowan basically from the JS side of things, we just say null and undefined are going to be seen by Go as NULL, while go NULL will always be undefined. It's just documentation :).

stevvooe commented 9 years ago

Regarding the following Request structure:

type Request struct {
    Cmd  string
    Data *msgpack.RawMessage
    Ret  libchan.Sender
}

Let's say we have a message F, with an io.Reader, along with some request specific fields:

type F struct {
    Headers map[string]string
    Body io.Reader
}

f := &F{
    Header: {"foo": "protocol specific header"}
    Body: body,
}

Now, if we want to send a message like F on a channel, we simply do it and don't worry about it:

if err := sender.Send(f); err != nil {
     //.. handle it
}

This will nicely create a stream for us and clone other values to work over the libchan codec, transparent to the user. This is a very nice feature as a user. But now, we decide we want to send other types of requests, other than F (like G and H), so we pull out the request envelope:

req := Request{
    Cmd: "F",
    Data: SomeEncodingFunction(f),
    Ret: receiver,
}

What does SomeEncodingFunction look like? What is the signature? How does this type interact with the send channels underlying state? For example, if Data has stream references, such as those in F, how are they managed? It seems like the sender/receiver would have to expose encoders that would have to associate them with a specific channel. What if that Data was encoded for one channel but is sent on another?

These possibilities expose a lot of error surface area and prevent the content of Data from being a first class type on the channel. The choices here seem to be to expose a complex encoding api that interacts with a channel or reduce the types that are supported in the child field. The former makes the API more complicated, which is a bad thing, and the latter reduces libchan to the functionality of existing rpc libraries.

It seems like we need some sort of typed channel features that allow one to detect and dispatch based on incoming types of messages. Another option is to have the message come over a special channel:

type Request struct {
    Cmd string
    Message libchan.Receiver
    Response libchan.Sender
}

With this flow, the encoding is not separated from Send and Receive and the incoming request data could be part of a strongly typed channel.

dmcgowan commented 9 years ago

Using raw message does not need to change any of the types. I was mentioning using it for decoding, not encoding. When sending a message which could have multiple types, my suggestion is still to use interface{}. If you want to decode into RawMessage you can either have a decode type or set the interface value in the decoded object to a RawMessage pointer.

type Request struct {
    Cmd  string
    Data interface{}
    Ret  libchan.Sender
}
// request := &Request{Data: new(Msgpack.RawMessage)}
// OR
type DecodeRequest struct {
    Cmd  string
    Data *msgpack.RawMessage
    Ret  libchan.Sender
}
// EVEN COULD SUPPORT THIS
type Request struct {
    Cmd  string
    Data interface{} `libchan:"raw"`
    Ret  libchan.Sender
}

There is quite a few possibilities, but I am staying away from offering my opinion at this time and focusing on ensuring RawMessage works and is useful.

The special function to encode into a RawMessage does not need to exist and should not exist. RawMessage carries around with it the extension context that is needed decode special types such as streams and channels. This is how the RawMessage type interacts with the underlying spdy session and the decoder, there is nothing awkward about the implementation. I am still working on the code but the structure is in place and open to review (added a PR for it in #79). I disagree about the complexity introduced to the encoding and I hope that is evidenced by a working implementation. Let's move any further discussion on this topic to the PR.

lox commented 6 years ago

Are there any up to date examples of this? The included example makes it look like the marshalling is handled internally.