liftbridge-io / liftbridge-api

Protobuf definitions for the Liftbridge gRPC API. https://github.com/liftbridge-io/liftbridge
Apache License 2.0
15 stars 13 forks source link

Protocol Changes #8

Closed llchan closed 4 years ago

llchan commented 4 years ago

This PR is a proposal for two protocol-level changes. Some stuff is not pushed yet, just getting this PR started.

llchan commented 4 years ago

A few comments and open questions:

tylertreat commented 4 years ago

To answer your questions...

I think it makes sense to make the Message as lean as possible, and additionally keep it immutable through its lifecycle. For publishing, we can include ack inbox/policy etc. in the publish request, and for subscriptions we include metadata in what I'm calling SubscriptionMessage.

I'm not sure we can do this. What of the case where you publish a Message directly to NATS, not through Liftbridge's Publish API?

I'm not sure reply and message headers need to be in Liftbridge. Those seem like things that should live inside the value (aka payload). The key makes sense, because it is used for log compaction, but the others seem application-level to me. I think we should keep Liftbridge as lean as possible unless it needs to know about these fields. Thoughts?

Reply is a NATS concept which Liftbridge includes on the message in order to expose the reply on the original NATS message. This is mainly for cases where publishers are publishing plain, Liftbridge-agnostic NATS messages. We want to expose the info set on the original message.

Headers I'm torn on since this was a frequently requested feature in NATS Streaming and Kafka supports it.

llchan commented 4 years ago
tylertreat commented 4 years ago

The first note about refactoring the Message is mostly internal to the client/server encoding. The user would still interact with the Publish API or NewMessage function, it would just be packed differently in the flatbuffers, such that the Message is self-contained and contains no pointers to anything outside of it (i.e. it is memcpy-able opaquely). Perhaps this will be more clear once I push up the next batch.

I think I follow now.

Now that I think about it, if we ever want to do application-agnostic header-based filtering/routing we would need that so we should leave that possibility open.

Yeah, that is what I was thinking as well. Headers give us a lot of flexibility to do interesting things in the future. Routing/filtering is a good example.

llchan commented 4 years ago

I'm also starting to look at the on-disk encoding, which I think can be unified with the wire protocol. One thing I didn't quite understand: what are Attributes? It doesnt currently exist in the wire protocol, just want to make sure I'm not missing something.

tylertreat commented 4 years ago

One thing I didn't quite understand: what are Attributes? It doesnt currently exist in the wire protocol, just want to make sure I'm not missing something.

The thought was to use attributes for flags such as compression, e.g. a compression codec. The on-disk protocol could definitely use some cleaning up though.

llchan commented 4 years ago

Ah got it, good idea. We can probably wrap the messages in a on-disk flatbuffers message, so that the only things we need to manually encode are the size and CRC-32. I think that will be more future-proof in the event that we want to add more metadata to the message (e.g. compression algorithm). Might also be good to add a header to the segment for segment-level configuration---arguably compression would be able to do more with large batches of messages, vs just within a single message. Anyways I'm kind of digressing, should keep this PR focused on the client-facing API.

llchan commented 4 years ago

Also I'm overseas for a work trip at the moment but I'll try to carve out some time this weekend to get some of the remaining things I have in my worktree pushed up. Hopefully we can get this closer to merge-ready sooner rather than later so everyone has some time to absorb the changes.

llchan commented 4 years ago

Hey sorry didn't get a chance to take a look while I was away but I'm back now so will give this some attention over the next few days.

tylertreat commented 4 years ago

@llchan Wondering what the state of this is?

llchan commented 4 years ago

Was running around a bit for Thanksgiving + a conference, but can pick up on this tomorrow. Last I left off I was reading through the existing on-disk serialization code for the liftbridge server. The implementation there may influence the way we should structure our messages---ideally we can memcpy blobs without parsing/unpacking, and possibly keep the door open for compression/mmap in the future. I think the current state of the PR already includes some adjustments towards this end, but will need to review.

tylertreat commented 4 years ago

Closed in favor of https://github.com/liftbridge-io/liftbridge-api/pull/14.