xmtp / xmtp-js

XMTP client SDKs, content types, and other packages written in TypeScript
https://xmtp.org/docs
211 stars 40 forks source link

Accommodating different message payload types #54

Closed mkobetic closed 2 years ago

mkobetic commented 2 years ago

The API currently accepts a string as the message payload. While there are no constraints as far as the contents of the string goes that is suboptimal if we want to support payloads other than plain text. A UInt8Array would signal better that the payload can be arbitrary, but then the recipient would need to employ some heuristics in figuring out what's the best way to present the payload to the user. Some way to signal what the content type is seems necessary.

There are many different ways to handle this and a lot of legacy to possibly draw on (MIME Content-Type field springs to mind). A more modern approach like https://docs.ipfs.io/concepts/content-addressing/ suggested by @neekolas could serve as an inspiration too.

I'd say we should aim for something simple and flexible that will allow users to introduce their own without requiring a central authority to manage a registry of accepted content types. Is it acceptable that some clients will simply refuse to render content-types they do not recognize?

neekolas commented 2 years ago

One other consideration is whether we want messages to have one encoding, or potentially multiple included.

For example, you could include a fancy encoding that only some clients support and also a simple text encoding that is universally accessible.

mkobetic commented 2 years ago

MIME multipart messages :)

mkobetic commented 2 years ago

Re central registry, we might be able to have the cake and eat it if the types as namespaced (let's say with a domain), then xmtp.org can manage its own type definitions and anyone else can introduce their own scoped to their domain.

neekolas commented 2 years ago

Totally agree we shouldn't have a central registry. I could see some solution where users of the SDK could pass in a map of content type identifiers and their corresponding encoder/decoder. We would just need to define a standard interface for the encoder/decoder and a format for the content type identifier.

We would probably build in a few defaults, like "text", and have the rest be opt in through extra packages.

mkobetic commented 2 years ago

I'm not sure we have to get into the business of actual payload encoding decoding. We could also just provide a way to submit the type indicator and the payload bytes assuming the client already did the job of encoding it into bytes. Similarly on the other end.

mkobetic commented 2 years ago

The main reason I'm somewhat hesitant to get into the content handling business is that it may create the expectation that any content can always be handled. But unless the client can dynamically discover new content types at runtime and dynamically load corresponding decoders, I think there's always a chance of encountering unknown content types in the wild. And even if we do "correctly" decode the content, that still doesn't guarantee the client renders the content correctly.

neekolas commented 2 years ago

Agree that we shouldn't create the expectation that the client will magically handle any message it receives. But it does seem like any instance of the client should be able to define the allowed message types at the time of client instantiation.

I just don't love the UX of making the caller have to think about the message as bytes. It means that for every message they receive, the developer have to go through the process of figuring out which decoder to use and then decoding it. Seems like something we can handle nicely as part of client instantiation instead of on a per message basis. But it would still be up to the developer to define which message formats they would support, and the expectation is that any unsupported format would be ignored (maybe we log a warning).

mkobetic commented 2 years ago

OK, that does sound like more user friendly approach. I was somewhat attracted by the surgical cleanliness of what I was proposing, but I definitely concede that doesn't imply user friendliness which is at least equally important.

neekolas commented 2 years ago

In my head I was imagining something (very roughly) like this. We would define the expected interface for the encoder/decoder and the format for the identifier, and then let developers use whatever messaging formats they see fit.

interface MessageContents {
    contentType: string;
    data: Uint8Array;
}

interface MessageConverter<T> {
    acceptedContentTypes: string[];
    encode(message: T): MessageContents;
    decode(contents: MessageContents): T;
}

const TextConverter = (): MessageConverter<string> => {
    return {
        // Will want to put a bunch more thought into the format of the content type here
        acceptedContentTypes: ['/v1/xmtp/text'],
        encode(message: string): MessageContents {
            return {
                contentType: '/v1/xmtp/text',
                data: new TextDecoder().encode(message)
            }
        },
        decode(contents: MessageContents) {
            return new TextDecoder.decode(contents.data);
        },
    };
};

const client = await Client.create(wallet, { messageConverters: [TextConverter()] });
neekolas commented 2 years ago

I actually don't think we would want to accept primitive types at all in conversation.send(...) (no strings, no UInt8Arrays, no numbers). We probably want to take in some object/class that includes both the content type and the data, so we can ensure the content type is always set to something on the message before we encrypt it.

neekolas commented 2 years ago

Just thinking out loud. If we want to define an interface, we'll want to go through some very careful deliberation about how that interface works, since it will be difficult to change once people start using it.

mkobetic commented 2 years ago

That looks really good to me. And I totally agree about the necessity to get the interface right. Two things that I think need addressing:

1) Multipart content that you brought up: We could support it natively at this level, or we could leave it to be provided as a subordinate content type. Unless we see some serious drawback there I'd be inclined to try the later as it keeps the primary interface clean and simple

2) Drawing on the MIME experience, they ended up supporting the ability to attach attributes to the primary content type identifier like the character encoding used or a fileName. Should we support a more structured form of the contentType field than just a string?

I feel it might be worth to scour the MIME spec and check out the relevant fields Content-Type, Content-Disposition, etc... It was a horrible mess in the end, but that doesn't mean there aren't some useful lessons there.

neekolas commented 2 years ago

Agree both of those are important considerations. I have no idea what the right answer is. There's literally decades of prior art here. You wanna take the lead on doing the research and putting together a RFC?

mkobetic commented 2 years ago

Yup, I can dig in and draft a proposal.

mkobetic commented 2 years ago

@neekolas, trying to write up the message conversion proposal, I'm struggling with the encoding side. How can we dispatch to the right encoder based on the type of message being passed into the API? IIUC Type is not a thing at runtime. Might make sense to associate the encoder with the message type itself, and maybe require it to respond to messageEncoder() function? Not sure how we'd allow for a plain string to be a message but maybe we can hardcode that case somehow?

neekolas commented 2 years ago

So, one possibility is that we just force all messages to be passed in as a class/object that includes the contentType as a field. We would have to hack something in for strings, like checking the typeof and wrapping in an object.

mkobetic commented 2 years ago

Rather than forcing every instance/object to have a field set explicitly, could we somehow drive it off the object prototype?

neekolas commented 2 years ago

You're right that we can't get the Typescript types at runtime. Definitely possible with object prototypes. We could include the object prototypes that each encoder is capable of encoding in the mapping provided to the Client.

That does get finicky as any message object that gets passed in would have to be of a whitelisted Object kind. There should be a clear path for developers to come up with their own implementations of already-specified content types.

mkobetic commented 2 years ago

I think I've just wasted a day trying to come up with a way to turn the client API into generic functions (send/receive) to be able to capture the type of content being passed in. I think my fundamental misunderstanding was that a generic function/type is not an actual function or type that can accommodate arbitrary type, it's a function/type template that can be instantiated with a specific type to get an actual function or type.

So is the natural conclusion that we can't properly type the input/output of the send/receive API on client and have to basically accept any (or a non-generic interface) and then dispatch internally based on the contentType identifiers or something? Is it possible to do any better than that?

This might clarify what I was trying to do https://github.com/xmtp/xmtp-js/pull/68/files#diff-5ffc3958a4528ff8b9749585dfa24336c47967cfbb2491f6bca90e6468ea31a5R24-R28

neekolas commented 2 years ago

I see what you mean about the generic functions.

We could probably at least enforce that the sender/receiver pass in an object with a bare bones interface rather than truly anything. Something like:

interface MessageContent {
    contentType: string
    content: any
}
mkobetic commented 2 years ago

Sigh, so I went with the interface, but I'm still stuck with the generic Encoder type. Not sure how to get around this problem

https://github.com/xmtp/xmtp-js/pull/68/files#diff-b7961e132a9ada7fe5b4cdec69198fc7e4fe506e45b5a7fe29f399a01c45ff53R155-R157

The encoder registry cannot be defined in terms of the generic Encoder type. 🤔

mkobetic commented 2 years ago

BTW, I'm not trying to bypass the RFC stage, I'm just trying to prototype what the API could look like, because it doesn't seem to be obvious.

neekolas commented 2 years ago

I think the list in the client needs to be MessageEncoder<any>. The individual encoders can be typed specifically, but the array can't because it's a mixed bag.

mkobetic commented 2 years ago

Hm, trying something like encoders: Map<string, MessageEncoder> tells me that generic type requires 1 type argument.

mkobetic commented 2 years ago

I pushed what I have to that branch if you want to take a closer look.

mkobetic commented 2 years ago

Ah, OK, MessageEncoder<any> might be workable, I'll run with it

mkobetic commented 2 years ago

Although I wonder whether MessageEncoder<string> is assignable to MessageEncoder<any>. I'll see.

neekolas commented 2 years ago

Should be

mkobetic commented 2 years ago

OK, MessageEncoder<any> seems to have worked out. I'll write it all up and then we can discuss where to go from here. Thank you for helping me through those hang ups ❤️

saulmc commented 2 years ago

Moved back to In Progress for:

saulmc commented 2 years ago

Resolved by #68 #96