`wasi-messaging` Refactor ft. stakeholder review

danbugs commented 1 year ago

This PR contains changes across all of the world's interfaces and focuses on addressing feedback from:

@stevelr ,
@autodidaddict (#8),
@clemensv ,
@devigned , and
@Mossaka .

Thank you everyone for your help and patience w/ this big refactor.

Below, I'll highlight the major changes to this interface. To see it actually being used w/ a dummy implementation, please refer to: https://github.com/danbugs/wasi-messaging-demo .

`types`

receiver and sender have been unified into client, and the client's only function is to connect to some sort of named message exchange service.
channel is no longer a variant w/ distinctions between broadcast and unicast channels – it's just a type alias to a string, as the rest are consumer issues.
added a guest-configuration record that contains a list of subscriptions that a specific guest owns, and an optional key-value pair list of 'extensions', which can contain specific config required for a particular implementor (e.g., partitions/offsets, etc.).
format-spec is now an enum rather than a string type alias to ensure implementors exhaustively support/address different types of data envelopes.
message no longer enforces CloudEvents. Instead, it's a record with a binary payload and a content-type-like field (i.e., format) for deserialization.
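
As a rough illustration of the shapes listed above, here is how the refactored types might look from Rust. This is a hand-written sketch, not generated bindings: the field names and the `FormatSpec` variants are assumptions made for the example, and the real WIT may differ.

```rust
/// Hypothetical Rust mirror of the refactored `types` interface.
/// Field and variant names are illustrative assumptions, not generated bindings.

/// A channel is just a name; broadcast vs. unicast is left to the consumer.
pub type Channel = String;

/// Closed set of envelope formats; the variants listed here are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FormatSpec {
    CloudEvents,
    Http,
    Mqtt,
    Raw,
}

/// A message is an opaque payload plus the format needed to decode it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub data: Vec<u8>,
    pub format: FormatSpec,
}

/// Subscriptions owned by a guest plus optional implementor-specific settings.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct GuestConfiguration {
    pub channels: Vec<Channel>,
    pub extensions: Option<Vec<(String, String)>>,
}

/// Because `FormatSpec` is an enum, this `match` must cover every variant,
/// which is the kind of exhaustiveness the enum is meant to encourage.
pub fn describe(format: FormatSpec) -> &'static str {
    match format {
        FormatSpec::CloudEvents => "CloudEvents envelope",
        FormatSpec::Http => "HTTP message",
        FormatSpec::Mqtt => "MQTT packet",
        FormatSpec::Raw => "opaque bytes",
    }
}
```

Adding a variant to the enum makes every non-exhaustive `match` fail to compile, which a plain string alias could not do.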

`consumer`

with the addition of guest-configuration and its capability to let us set specific partitions/offsets to read from, stream-consumer and basic-consumer were unified into a single consumer interface.
added subscribe-try-receive/subscribe-receive methods to fulfill the request-reply pattern – they aim to create an ephemeral subscription to receive from a channel until a specific timeout/indefinitely, respectively. Also, since for some implementors it's uncommon to return a single message, both methods return list<message>.
added update-guest-configuration, which allows for things like: setting a specific partition/offset to read messages from in the case of Kafka/EventHubs, or even just unsubscribing. It's meant to be a catch-all solution for addressing any implementor-specific quirks.
added settlement methods for completing/abandoning messages – what the host should do when reaching these states could be configured through guest-configuration.
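
To make the receive-then-settle flow concrete, here is a small in-memory sketch in Rust. It is not the interface itself: `InMemoryChannel`, the message `id` field, and the choice to requeue on abandon are all assumptions made for illustration (the description above leaves the abandon behavior to host configuration).

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub id: u64, // hypothetical handle used for settlement
    pub data: Vec<u8>,
}

/// In-memory stand-in for a host-side channel; not a real broker.
#[derive(Default)]
pub struct InMemoryChannel {
    ready: VecDeque<Message>,
    in_flight: Vec<Message>,
}

impl InMemoryChannel {
    pub fn publish(&mut self, msg: Message) {
        self.ready.push_back(msg);
    }

    /// Non-blocking receive: hands out everything currently queued as a batch.
    pub fn try_receive(&mut self) -> Vec<Message> {
        let batch: Vec<Message> = self.ready.drain(..).collect();
        self.in_flight.extend(batch.iter().cloned());
        batch
    }

    /// Settles a message as done; it will not be delivered again.
    pub fn complete(&mut self, id: u64) -> bool {
        let before = self.in_flight.len();
        self.in_flight.retain(|m| m.id != id);
        self.in_flight.len() != before
    }

    /// Gives a message back; this sketch simply requeues it for redelivery.
    pub fn abandon(&mut self, id: u64) -> bool {
        match self.in_flight.iter().position(|m| m.id == id) {
            Some(i) => {
                let msg = self.in_flight.remove(i);
                self.ready.push_back(msg);
                true
            }
            None => false,
        }
    }
}
```

Returning a batch from `try_receive` mirrors the `list<message>` return type, and both settlement calls report whether the message was actually in flight.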

`producer`

just like for the consumer, stream-producer and basic-producer were unified into a single producer.
to mirror consumer, send takes list<message>.

`guest`

as Wasm modules are not meant to be long-running instances, the guest interface was designed to be quickly called and thrown away. Here's the imagined workflow: (1) the host calls configure to obtain a list of implementor-specific configurations (if needed) and a list of channels to subscribe to, and then creates a long-running client connection on the host side. At this point, the host can kill this component instance. (2) Once any messages come in for the subscriptions returned by configure, the host can spin up a new component and call the guest's handler to take care of the incoming messages. For the same reasons given for consumer and producer, handler receives a list<message>.
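
The two-step lifecycle can be simulated in plain Rust. This is only a sketch under stated assumptions: the `Guest` trait, the `ByteCounter` guest, and `run_host` are hypothetical names, and creating and dropping a value stands in for the host instantiating and discarding a component.

```rust
/// Minimal stand-ins for the interface types; names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub data: Vec<u8>,
}

#[derive(Debug, Clone, Default)]
pub struct GuestConfiguration {
    pub channels: Vec<String>,
    pub extensions: Option<Vec<(String, String)>>,
}

/// The two entry points the host calls on a guest component.
pub trait Guest {
    fn configure(&self) -> GuestConfiguration;
    fn handler(&self, messages: Vec<Message>) -> Result<usize, String>;
}

/// Toy guest: subscribes to one channel and counts payload bytes.
pub struct ByteCounter;

impl Guest for ByteCounter {
    fn configure(&self) -> GuestConfiguration {
        GuestConfiguration { channels: vec!["orders".to_string()], extensions: None }
    }

    fn handler(&self, messages: Vec<Message>) -> Result<usize, String> {
        Ok(messages.iter().map(|m| m.data.len()).sum())
    }
}

/// Simulated host. Each step builds a fresh guest value and drops it,
/// standing in for "spawn a component instance, call it, throw it away".
pub fn run_host(batch: Vec<Message>) -> Result<(Vec<String>, usize), String> {
    // Step 1: short-lived instance, used only to learn the subscriptions.
    let channels = {
        let guest = ByteCounter;
        guest.configure().channels
    };
    // The host would now hold the long-running broker connection itself.

    // Step 2: messages arrived, so spawn a new instance just for this batch.
    let guest = ByteCounter;
    let handled = guest.handler(batch)?;
    Ok((channels, handled))
}
```

Nothing survives between the two steps except what the host keeps, which is the property the workflow relies on.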

stevelr commented 1 year ago

Please bear with the long-ish response. I want to articulate some goals, so if there's a difference in perspective we can address that before getting stuck in the details of apis.

The difference between basic- and stream- seems to be a difference in metadata associated with the message.

both include subject (aka entity), format (aka type), and optional metadata
stream- also includes partition and offset

Is it possible to combine and simplify so that we end up with only one type of producer and one type of consumer?

If we unwrap all the layers, the full set of metadata considered by the various send/receive apis here is

message data (list<u8>)
subject (string)
type, e.g., mime type (string)
partition (u32)
offset (u32)
optional metadata (list<tuple<...>>)

Sender and Receiver objects embody the subject name. I think that abstraction adds an unnecessary layer of complexity. Some messaging systems (at least NATS and MQTT, possibly others) permit subscribing to a set of topics with wildcards, so subject needs to be delivered with each message. In other words, in the general api, a subscriber can't receive a message without also receiving its subject. Similarly, a single producer can often transmit messages on multiple subjects. In both cases, if subject were a string parameter of send and receive, it would simplify usage and remove the need for the sender and receiver objects.
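
To illustrate why the subject has to travel with each message, here is a small Rust sketch of NATS-style wildcard matching (`*` for exactly one token, a trailing `>` for one or more tokens). The `Delivery` struct and `deliver` helper are hypothetical names for this example; MQTT uses different wildcard characters, which are not covered here.

```rust
/// Returns true if `subject` matches `pattern` under NATS-style rules:
/// tokens are separated by '.', `*` matches exactly one token, and a
/// trailing `>` matches one or more remaining tokens.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            // `>` must be the last pattern token and needs at least one subject token.
            (Some(">"), Some(_)) => return pat.next().is_none(),
            (Some("*"), Some(_)) => continue,
            (Some(p), Some(s)) if p == s => continue,
            (None, None) => return true,
            _ => return false,
        }
    }
}

/// A received message carries its concrete subject (hypothetical shape).
pub struct Delivery {
    pub subject: String,
    pub data: Vec<u8>,
}

/// Selects the deliveries a wildcard subscription would see.
pub fn deliver<'a>(pattern: &str, incoming: &'a [Delivery]) -> Vec<&'a Delivery> {
    incoming
        .iter()
        .filter(|d| subject_matches(pattern, &d.subject))
        .collect()
}
```

A subscriber to `orders.*` may receive `orders.eu` and `orders.us` through the same subscription, so only the per-message subject tells them apart.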

NATS, PubSub, and MQTTv5 include optional attributes (aka headers or properties) with each message, so an optional tuple of properties also seems warranted, for both send() and receive().

The format parameter is sometimes useful. A few are enumerated: MQTT, HTTP, etc. If a format parameter is part of the api, it needs a 'raw' variant as a catch-all if none of the other formats applies. Perhaps a mime type would be more inclusive and handle even more variants. To decide whether it should be included in the send/receive signatures (or a message struct), and incurring the complexity and expense, we want to know how often it's needed. In most cases I've seen, the format is effectively communicated by the subject name and the documentation for the message stream. For example, if you're watching for changes to a database based on a set of keys, or watching a file system for changes to a file, the db and fs specify the format of the event. It's often specified along with the subject name of the event stream, in which case the publisher and subscriber already know the data format.

There are always tacit aspects of a message protocol that are part of its spec, in the documentation. If the implementers of the publisher and consumer have the same understanding of the protocol, and it doesn't vary per message, it doesn't need to be sent with each message. Other protocol attributes that are often not explicit fields in the message are compression algorithm and encryption algorithm. In cases where additional or optional metadata is required to describe format, or other message attributes, the format can be included in a property map with other metadata (for example, the Content-Type value in the headers map in an HttpRequest), or the format and other metadata can be put into a wrapper object that becomes the new message (for example, a CloudEvents message has explicit metadata with a data field for the inner payload). If, in the vast majority of cases, format is tacit, or it can be included in metadata or the message object, it doesn't need its own field in the wasi messaging interface, either as a field of message or as a function parameter.

That leaves partition and offset. Although they are used with Kafka, and Kafka may be a common expected use case, it's not needed for a lot of other implementations, and harder to justify for inclusion in a common api used by all implementations. Every user of this api would have to go through a decision process on whether to interpret the offset parameter as a sequence, and wonder if they need to implement message reordering. If our goal is to make it easy to create lightweight WASM components, we don't want users to even have to think about message reordering - it should be in a service layer.

How can we enable WASM developers to use Kafka partitions, and QOS, and TTL, and other variations of messaging services that might be critical to their needs? In the short term, we have ways to handle variations: on the producer side, create components that accept additional service-specific parameters and pass a wrapper object as the message to the basic interface. On the consumer side, use a component that decodes the wrapper object and calls a subscriber that knows how to handle the extra parameters for receive. Similarly, wrappers can be created when subscribing, or when obtaining a broker handle.
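
A minimal Rust sketch of the wrapper-object idea follows. The `KafkaEnvelope` type and its byte layout are assumptions invented for this example, not part of any proposed interface: the producer-side wrapper encodes the extra parameters into the opaque payload, and the consumer-side wrapper decodes them again.

```rust
/// Hypothetical Kafka-specific wrapper carried inside the basic payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KafkaEnvelope {
    pub partition: u32,
    pub offset: u32,
    pub payload: Vec<u8>,
}

impl KafkaEnvelope {
    /// Layout (an assumption for this sketch): partition and offset as
    /// big-endian u32s, followed by the raw payload.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(8 + self.payload.len());
        out.extend_from_slice(&self.partition.to_be_bytes());
        out.extend_from_slice(&self.offset.to_be_bytes());
        out.extend_from_slice(&self.payload);
        out
    }

    /// Returns `None` if the buffer is too short to hold the 8-byte header.
    pub fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() < 8 {
            return None;
        }
        let partition = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
        let offset = u32::from_be_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
        Some(Self { partition, offset, payload: bytes[8..].to_vec() })
    }
}
```

The basic interface only ever sees the output of `encode` as message data, so components that do not care about partitions are unaffected.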

I'm hopeful that the initial set of wasi interfaces can have a high enough level of abstraction to be broadly useful and simple to use, where api designers apply a high bar before deciding to add parameters that increase complexity for all users. Ideally, keeping these shared wasi interfaces simple will make them broadly applicable, and help adoption by shortening the learning curve for all developers.

One way to future-proof the basic api would be to allow config parameters when opening a broker. That could be an escape hatch used by wrapper components that need to pass implementation-specific parameters into the basic interface.

To summarize, I propose:

combining basic-producer, basic-consumer, stream-producer, stream-consumer into producer, consumer
combining sender, receiver into string, and including subject in each invocation of send and receive (as a function parameter or field of the message object)
removing partition, offset

This leaves us with something like this:

    interface broker {
        open(config: option<metadata>) -> result<broker, error>
        publish() -> result<producer, error>
        /// subscribe to a subject or wildcard set of subjects
        subscribe(subject: string) -> result<consumer, error>
    }
    interface producer {
        use self.types.{error, message}
        send(message: message) -> result<_, error>
    }
    interface consumer {
        use self.types.{error, message}
        receive() -> result<message, error>
        receive-timeout(timeout: timeout-ms) -> result<message, error>
    }
    interface types {
        record message {
            subject: string,
            data: list<u8>,
            metadata: option<metadata>,
        }

        type metadata = list<tuple<string, string>>
        /// timeout in milliseconds
        type timeout-ms = u32

        variant error {
            timeout,
            ...
        }
    }

autodidaddict commented 1 year ago

I agree with @clemensv 's comment from last month that showing some sample code might help make writing the wit easier. I'd like to be able to do the following in my Rust wit-bindgen code:

let reply = messaging::request("service.api", my_bytes, Some(2_000))?; // wait for a reply for max of 2s

// collect pong replies, stopping after 2s or 30 replies, whichever comes first
let pong_collection = messaging::request_multi("ping", vec![], 30, Some(2_000))?;

If I want to use an envelope structure here, then it's my responsibility as the component developer to do that. The broker should remain unaware of the shape of the innards of my messages, and doesn't need to know how to decode my envelopes.

autodidaddict commented 1 year ago

@clemensv in #8 's comment section you mention that the pattern of request/reply as well as things like scatter-gather are application-specific patterns and shouldn't pollute the foundation. Would you then also assert that the latest revision's complete and abandon pair are too much like an app-level ack/nack pair? It feels like if request/reply, scatter/gather are app patterns, then so too is ack/nack.

One of the litmus tests I've been using around these wit interfaces is that if it looks like at least 80% of the consumers of the interface would likely have to write a wrapper to add a super-common pattern, we should consider including that pattern in the interface. This means the difference between being able to use wit-bindgen code off the shelf versus having to write a wrapper library.

Thoughts?

danbugs commented 1 year ago

About request/reply – While I understand that it is only one pattern of many, I do believe it's a very commonly used pattern (afaik, more than scatter/gather), and, so, it might align with the goal of wasi-cloud-core interfaces of covering 80% of user applications that use x capability.

I'll work on adding some example code to this PR for ease of review next!

autodidaddict commented 1 year ago

Yeah, I lean toward request/reply being more fundamentally required than the other patterns (like gather).

clemensv commented 1 year ago

> @clemensv in #8 's comment section you mention that the pattern of request/reply as well as things like scatter-gather are application-specific patterns and shouldn't pollute the foundation. Would you then also assert that the latest revision's complete and abandon pair are too much like an app-level ack/nack pair? It feels like if request/reply, scatter/gather are app patterns, then so too is ack/nack.
>
> Thoughts?

Settlement is a core function of all messaging systems even if it comes in slightly different shapes. And it's a direct interaction with the broker and/or peer endpoint.

Request/reply is an application-level contract that is projected through the infrastructure and not even AMQP or MQTT5, which have correlation metadata for reply paths, have firm opinions about how those are to be used at the protocol level and punt that up to the app as a problem. MQTT5 has a non-normative section about it, AMQP is completely silent beyond describing the properties.

autodidaddict commented 1 year ago

> Settlement is a core function of all messaging systems even if it comes in slightly different shapes. And it's a direct interaction with the broker and/or peer endpoint.

Do MQTT and websockets support settlement?

I suppose there's no harm in an implementor of the interface doing a no-op in response to the complete/abandon requests.

If the interface doesn't support explicit request/reply semantics, then what provision is there in the interface to provide hints/metadata as to the reply mechanism? For example, most brokers that support request/reply intrinsically require you to supply the topic on which the reply is to be published.

Please note that I'm not contradicting opinions, I'm just trying to tease out multiple perspectives based on people's experiences with multiple different brokers. It's easy for me to assume "all brokers" work a certain way if I've only worked with NATS, Kafka, AWS's SQS, and RabbitMQ. I'm genuinely surprised by the notion that request/reply isn't a fundamental concept supported by all message brokers.

clemensv commented 1 year ago

Yes, MQTT does support settlement in the form of QoS 1 and QoS 2 handshakes.

WebSockets is a stream protocol with framing layered on top of HTTP and generally requires an overlaid application protocol to be used for messaging. Putting JSON into text frames is where an overlaid application protocol starts, and if you add a "got it!" reply, you've got settlement. Frameworks like socket.io or SignalR (there are many) have settlement as a feature (example).

AMQP and MQTT can be layered on top of WebSockets as application protocol layer and that is quite common.

Regarding request/reply: Some of the brokers you mention support request/reply as a construct over their underlying primitives. Kafka doesn't. NATS uses reply subjects and Rabbit uses reply queues, but also has a way to create reply queues on the fly. You can construct something similar with JMS.

However, the core messaging APIs generally remain async and one-way. Even for the "Direct Reply-To" feature of Rabbit, the replies are pulled from a pseudo-queue.

You would have a similar model for AMQP 1.0 in brokerless peer-to-peer mode that allows you to create a bi-directional communication connection but with independent flows over opposing links. Requests and replies would be matched by the overlaid client. The messaging path is oblivious to the app-level correlation by design.
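
The "matched by the overlaid client" point can be sketched in Rust as client-side bookkeeping layered over one-way messages. Everything here is hypothetical (`Envelope`, `Correlator`, and the field names are invented for the example); the messaging path in between would just carry the bytes and metadata without interpreting them.

```rust
use std::collections::HashMap;

/// One-way message with reply metadata; field names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Envelope {
    pub correlation_id: u64,
    pub reply_to: Option<String>,
    pub data: Vec<u8>,
}

/// Client-side bookkeeping that pairs replies with outstanding requests.
#[derive(Default)]
pub struct Correlator {
    next_id: u64,
    pending: HashMap<u64, String>, // correlation id -> label of the request
}

impl Correlator {
    /// Builds a request envelope and remembers that a reply is expected.
    pub fn request(&mut self, label: &str, reply_to: &str, data: Vec<u8>) -> Envelope {
        self.next_id += 1;
        self.pending.insert(self.next_id, label.to_string());
        Envelope {
            correlation_id: self.next_id,
            reply_to: Some(reply_to.to_string()),
            data,
        }
    }

    /// Matches an incoming message to a pending request, if any.
    /// Unknown or duplicate ids yield `None`.
    pub fn on_reply(&mut self, reply: &Envelope) -> Option<String> {
        self.pending.remove(&reply.correlation_id)
    }
}
```

Both directions stay asynchronous and one-way; only the client knows that two messages belong together.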

autodidaddict commented 1 year ago

Thanks for filling in the blanks @clemensv . Some aspects of this are much clearer now.

danbugs commented 1 year ago

@autodidaddict , and @clemensv ~ As requested, I've added some examples.

Also, @Mossaka , I've updated the interface to the latest WIT version for the examples, so I'll be closing #12 .

guest271314 commented 1 year ago

I think this can be as simple as Native Messaging protocol https://developer.chrome.com/docs/extensions/mv3/nativeMessaging/#native-messaging-host-protocol.

> The same format is used to send messages in both directions; each message is serialized using JSON, UTF-8 encoded and is preceded with 32-bit message length in native byte order.

We don't need subject or metadata as separate fields. Just a Uint32Array containing message length followed by the message as JSON. That's it.
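
For reference, the framing described above (a 32-bit length in native byte order followed by UTF-8 JSON) can be sketched in Rust like this. The function names are made up for the example, and the JSON text is treated as an opaque string since parsing it would need a crate outside the standard library.

```rust
/// Frames a JSON string: 4-byte length in native byte order, then UTF-8 bytes.
pub fn encode_frame(json: &str) -> Vec<u8> {
    let body = json.as_bytes();
    let mut out = Vec::with_capacity(4 + body.len());
    out.extend_from_slice(&(body.len() as u32).to_ne_bytes());
    out.extend_from_slice(body);
    out
}

/// Splits one frame off the front of `buf`.
/// Returns the JSON text and the unread remainder, or `None` if the
/// buffer is incomplete or the body is not valid UTF-8.
pub fn decode_frame(buf: &[u8]) -> Option<(&str, &[u8])> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_ne_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let rest = &buf[4..];
    if rest.len() < len {
        return None;
    }
    let (body, tail) = rest.split_at(len);
    let text = std::str::from_utf8(body).ok()?;
    Some((text, tail))
}
```

Note that the frame has no place for a subject or metadata, so those would have to live inside the JSON body itself.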

autodidaddict commented 1 year ago

> We don't need subject or metadata as separate fields. Just a Uint32Array containing message length followed by the message as JSON. That's it.

This may be too simple of a protocol. Components need to be able to dictate the topic on which messages are published. We can't just assume that there's one catch-all channel like stdio. Without being able to indicate a publication subject/topic, a huge swath of common scenarios and use cases won't be possible for components.

guest271314 commented 1 year ago

> This may be too simple of a protocol.

Simple and effective without being over-specified or over-engineered needlessly.

We can stream real-time audio using the protocol, where you and users will notice gaps and glitches in playback which can be hidden in video.

Here's the same algorithm in C, C++, JavaScript (QuickJS), and Python where it is unobservable client side which programming language(s) are being used as the host.

and some other Native Messaging hosts https://github.com/guest271314/NativeMessagingHosts.

> Components need to be able to dictate the topic on which messages are published.

Then do that.

Whether using a specific name for a host, e.g., nm_c_wasm that does one specific task, echo back 1 MB that the client sent to the host https://github.com/guest271314/native-messaging-webassembly/blob/main/nm_c_wasm.json, or open-ended, e.g., the host does whatever you tell it to do

const port = chrome.runtime.connectNative('capture_system_audio');
port.onMessage.addListener((message) => {
  // Do stuff with message, e.g., [0, 255]
});
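port.onDisconnect.addListener((port) => {
  // Chrome reports the reason via chrome.runtime.lastError; Firefox sets port.error
  if (chrome.runtime.lastError || port.error) {
    // Handle port disconnecting or error
  }
});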
port.postMessage('parec -d @DEFAULT_MONITOR@');

and

let text = `So we need people to have weird new
ideas ... we need more ideas to break it
and make it better ...

Use it. Break it. File bugs. Request features.

- Real time front-end alchemy, or: capturing, playing,
  altering and encoding video and audio streams, without
  servers or plugins!
  by Soledad Penadés

von Braun believed in testing. I cannot
emphasize that term enough – test, test,
test. Test to the point it breaks.

- Ed Buckbee, NASA Public Affairs Officer, Chasing the Moon

Now watch. ..., this how science works.
One researcher comes up with a result.
And that is not the truth. No, no.
A scientific emergent truth is not the
result of one experiment. What has to
happen is somebody else has to verify
it. Preferably a competitor. Preferably
someone who doesn't want you to be correct.

- Neil deGrasse Tyson, May 3, 2017 at 92nd Street Y

It’s like they say - if the system fails you, you create your own system.

- Michael K. Williams, Black Market

1. If a (logical or axiomatic formal) system is consistent, it cannot be complete.
2. The consistency of axioms cannot be proved within their own system.

- Kurt Gödel, Incompleteness Theorem, On Formally Undecidable Propositions of Principia Mathematica and Related Systems`;

const port = chrome.runtime.connectNative('capture_system_audio');
port.onMessage.addListener((message) => {
  // Do stuff with message, e.g., [0, 255]
});
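port.onDisconnect.addListener((port) => {
  // Chrome reports the reason via chrome.runtime.lastError; Firefox sets port.error
  if (chrome.runtime.lastError || port.error) {
    // Handle port disconnecting or error
  }
});
port.postMessage({cmd:`espeak-ng -m -v Storm --stdout`, input:`"${text}"`});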

And/or when networking is used simply use static URL endpoints, or query string parameters, or use QUERY. Though HTTP(S), QUIC, etc. have all kinds of extra headers and extraneous data to parse before we get to the data.

Using the Native Messaging protocol, as long as the producer is expecting whatever the consumer throws at it, we can simply call the appropriate commands to stream to stdout, or pass commands or even a socket connection on to the target endpoint(s).

Or, as demonstrated above, pass whatever you want, e.g., arbitrary shell commands as a string or JSON (JavaScript Object Notation), or encoded text or other data as Uint8Array or other TypedArray serialized to the JSON format [0, 65536].

I am relatively certain you have not come up with use-cases the Native Messaging protocol cannot produce to and consume from efficiently.

guest271314 commented 1 year ago

> We can't just assume that there's one catch-all channel like stdio.

Ultimately all programs write to stdio or a file, or stream or do something the user can observe.

Sure we can. It's up to the host to handle the message, or not. This is how I check if a command passed to a Deno Native Messaging host should be executed or not - in the local HTTPS server the host starts https://github.com/guest271314/native-messaging-espeak-ng/blob/deno-server/local_server.js.

const commands = new Set();
commands.add('espeak-ng');

const text = await request.json();
      const json = text.cmd.split(' ').concat(text.input);
      const program = json.shift();
      if (commands.has(program)) {
        const command = new Deno.Command(program, {
          args: json,
          stdout: 'piped',
          stdin: 'piped',
        });
        const process = command.spawn();
        const reader = process.stdout.getReader();
        body = new ReadableStream({
          async pull(c) {
            const { value, done } = await reader.read();
            if (done) {
              c.close();
              return; // nothing left to enqueue once the stream is closed
            }
            c.enqueue(value);
          },
          async cancel(reason) {
            process.kill('SIGKILL');
            server = null;
            controller = null;
          },
        });
      }
      return new Response(body, responseInit);

Now I created this server to test and compare Fetch (Standard) API and Streams (Standard) API to Native Messaging locally. The client can't tell the difference. So why use a more verbose and complex communication system with multiple layers of metadata (headers, etc.)?

There are far fewer moving parts using the Native Messaging host only. Technically we can establish a TCP socket and use that as the communication mechanism, similar to WebSocketStream, with fewer moving parts than Fetch; both can be bi-directional streams. There are fewer moving parts still in a Native Messaging host that just does what you tell it to do, whatever that might be, and streams to stdout https://github.com/guest271314/native-messaging-deno/blob/main/nm_deno.js.

danbugs commented 1 year ago

@guest271314 ~ I think it might be helpful for you to open a separate issue highlighting what the interface should look like in accordance with NativeMessaging and how other implementors (e.g., Azure Service Bus, NATS, Kafka, Mosquitto, etc.) would utilize it to perform the required actions highlighted in this interface with the simpler foundation.

danbugs commented 1 year ago

@autodidaddict , and @clemensv ~ As this PR has already been open for a while, I'm debating concluding it as is, and creating separate issues to allow for more targeted PRs, thoughts? I'd really appreciate your help with creating issues that you think are pertinent. Please, give me an OK, or let me know if there's anything you think should be concluded in this very PR.

autodidaddict commented 1 year ago

> @autodidaddict , and @clemensv ~ As this PR has already been open for a while, I'm debating concluding it as is, and creating separate issues to allow for more targeted PRs, thoughts? I'd really appreciate your help with creating issues that you think are pertinent. Please, give me an OK, or let me know if there's anything you think should be concluded in this very PR.

I think merging and moving forward is probably a good idea. From what I can tell of the "current" files, the conflict between manually subscribing to a channel and returning a list of subscriptions from the configure function has been reconciled (I think we only have configure now). As long as it's internally consistent we can iterate via more issues and PRs.

I'm hoping to be able to build a prototype from this to get some "hands on" time that will potentially lead to future issues.

danbugs commented 1 year ago

I got approval from all current reviewers, so I'll be merging this PR. Issues will spawn soon to target anything that wasn't resolved yet.
