Use Cases for a "self describing postcard"

jamesmunns / postcard

A no_std + serde compatible message library for Rust

Apache License 2.0

Use Cases for a "self describing postcard" #92

Open jamesmunns opened 1 year ago

jamesmunns commented 1 year ago

TL;DR: I want to hear from you if you have ever needed postcard to do something ("something" is defined below) that it doesn't today.

Background

Postcard is generally very efficient on the wire, partly because it is not "self describing" - the messages themselves give no hint or expectation on how they are to be deserialized.

In optimal cases, where both sides of the communication are Rust, and use the same serde representation/type definition (e.g. - they share a common "types" crate that defines the wire types), this is great, and both sides understand each other.

However there are some sub-optimal cases:

- The schema changes over time, leaving an "old device" with an "old schema" unable to communicate because the wire format has changed, and more specifically - unable to reliably DETECT that the wire format has changed
- It is desirable to have non-Rust code perform the deserializing or serializing of a given message type
- One sender or receiver is not aware of the schema at compile time, and would like to serialize or deserialize message types "not known" until runtime, either for logging or actual use.
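
To make the first case concrete, here is a minimal sketch of how a schema change can go undetected when the bytes carry no description of themselves. It does not use postcard's API: `ReadingV1`, `ReadingV2`, `encode_v1` and `decode_v2` are hypothetical names, and the fixed-width little-endian layout is only a stand-in for a non-self-describing encoding.

```rust
// Hand-rolled stand-in for a non-self-describing encoding: fields are written
// back to back, with no names, tags, or version marker on the wire.
// Assumption: this is an illustration only, not postcard's API.

/// Sender's view of the message (the "old schema").
#[derive(Debug, PartialEq)]
pub struct ReadingV1 {
    pub temp: f32,
    pub humidity: f32,
}

/// Receiver's view after the two fields were reordered (the "new schema").
#[derive(Debug, PartialEq)]
pub struct ReadingV2 {
    pub humidity: f32,
    pub temp: f32,
}

/// Writes the fields in declaration order as little-endian bytes.
pub fn encode_v1(msg: &ReadingV1) -> Vec<u8> {
    let mut out = Vec::with_capacity(8);
    out.extend_from_slice(&msg.temp.to_le_bytes());
    out.extend_from_slice(&msg.humidity.to_le_bytes());
    out
}

/// Reads two f32 values in the NEW declaration order.
/// The length check passes for old messages too, so nothing here can
/// notice that the sender used a different field order.
pub fn decode_v2(bytes: &[u8]) -> Option<ReadingV2> {
    if bytes.len() != 8 {
        return None;
    }
    let first = f32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let second = f32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    Some(ReadingV2 { humidity: first, temp: second })
}
```

Encoding `ReadingV1 { temp: 21.5, humidity: 40.0 }` and decoding it with `decode_v2` succeeds, but yields `humidity: 21.5, temp: 40.0`. The receiver gets no error, only swapped values.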

Today

I'm currently looking into ways it would be possible to augment postcard data with schema information, so the "sub-optimal" cases listed above could be handled.

To be clear - postcard's core format will not change.

Ideally this would be an "optional add-on" - something you can use contextually, sometimes even after the fact, to enable those suboptimal cases, or as "extra metadata" you could send either with every message, or "on first connection", or "on request", or whatever makes sense for your link budget.

If that isn't possible, I'd probably look into making this a second crate "inspired by postcard", which can be used when a little more overhead is worth the flexibility.

That being said - I'm trying not to focus as much on "how" to make this possible yet, and instead looking at "what is needed". Discussions of "how to do this" are out of scope for this issue's comments.

What I need

Instead of blindly implementing what I THINK would be useful (to me, at least), I'd like to hear from folks who have run into the sub-optimal cases above, or even ones that I didn't list above. This will help me make sure whatever I end up researching/implementing covers the actual needs/gaps in today's postcard.

Ideally, I'd like to keep this discussion public, but I am also willing to have a private chat via email or matrix (contact info on my profile, or ask here), and I am willing to sign/provide an MNDA to discuss any proprietary usage that might benefit from changes like the ones proposed.

Thank you!

jamesmunns commented 1 year ago

Also: If you think you might want this or want to try this out before it releases, feel free to sound off here as well. I'll keep you in the loop whenever I have something ready to try.

jamesmunns commented 1 year ago

Here's some prior art that was shown to me (a different way of generating the schema): https://docs.rs/serde-reflection/latest/serde_reflection/

It's good to see their Serde Data Model types match fairly 1:1 with what we came up with.

Pros: It doesn't require a second derive
Cons: It can't be built at compile time

This issue is more about what to do with that schema, but I should probably review their "Features and Limitations", as we will likely have similar constraints.

jeromegn commented 1 year ago

I'm not sure if the following thoughts are related, but here's what I faced recently:

We accept client input as JSON, but we store in postcard's format for space efficiency and performance.

What surprised me at first was that, even though postcard uses serde's derives, it doesn't support everything serde does for other formats. I get that there are no guarantees that a format would implement all serde features, but there are some surprises (e.g. #[serde(untagged)] will only error when deserializing, not when serializing).

This also means that we can only support the lowest common set of features for the formats we support (that's just json and postcard right now). Offering an untagged enum for a client API feels nice for forward compatibility, but we can't use the same types since that's unsupported by postcard.

We can't make our schema evolve unless we use enums for everything

Given there's no support for untagged enums, unless we've used enums from the very beginning to version every input/output, we can't make our schema evolve. Even if we decided to move from a non-enum to an enum, this is a breaking change.

That's because we store our data in a persistent store and retrieving it (deserializing it in the process) is not possible if the types have changed in any way.

Can't append new fields to a struct

This is the big reason why we need to version things: we can't add new fields, even if they have default values. I'm looking at alternative formats that would at least allow that. For example speedy supports default_on_eof. That's likely not something postcard could support because it is bound to serde's API. It would have to either make a breaking change (though it's kind of an additive change, not sure if it should be considered breaking) or switch to non-serde.

For context, we're accepting client data as JSON, we're exchanging data between nodes and storing data as postcard's format and we're also storing data in SQLite. When we add new columns in SQLite, we have to either give it a default value or make it nullable. This works well w/ new struct fields with #[serde(default)], but that doesn't work (adding fields at the end of a struct) with postcard, forcing us to create a new "version" of our schema and adding a lot more logic to handle all versions.
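
As an illustration of the appended-field problem, here is a minimal sketch of why old bytes fail against a struct that gained a trailing field, and what a "default on EOF" rule would change. It uses neither postcard nor speedy: the record types, `take`, and both decoders are hypothetical, and the fixed-width layout is an assumption for the example rather than postcard's wire format.

```rust
/// Version 1 of a stored record.
pub struct RecordV1 {
    pub id: u32,
}

/// Version 2 appends one field at the end.
#[derive(Debug, PartialEq)]
pub struct RecordV2 {
    pub id: u32,
    pub retries: u8,
}

#[derive(Debug, PartialEq)]
pub enum DecodeError {
    UnexpectedEof,
}

/// Hypothetical encoder: fields back to back, no field names or version tag.
pub fn encode_record_v1(r: &RecordV1) -> Vec<u8> {
    r.id.to_le_bytes().to_vec()
}

/// Splits `n` bytes off the front of `input`, or returns None at end of input.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let current: &'a [u8] = *input;
    if current.len() < n {
        return None;
    }
    let (head, tail) = current.split_at(n);
    *input = tail;
    Some(head)
}

fn read_id(input: &mut &[u8]) -> Result<u32, DecodeError> {
    let b = take(input, 4).ok_or(DecodeError::UnexpectedEof)?;
    Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Strict decoding: every field must be present, so V1 bytes are rejected.
pub fn decode_v2_strict(mut input: &[u8]) -> Result<RecordV2, DecodeError> {
    let id = read_id(&mut input)?;
    let retries = take(&mut input, 1).ok_or(DecodeError::UnexpectedEof)?[0];
    Ok(RecordV2 { id, retries })
}

/// Lenient decoding: a missing TRAILING field falls back to its default (0).
pub fn decode_v2_lenient(mut input: &[u8]) -> Result<RecordV2, DecodeError> {
    let id = read_id(&mut input)?;
    let retries = match take(&mut input, 1) {
        Some(b) => b[0],
        None => 0, // assumed default for the new field
    };
    Ok(RecordV2 { id, retries })
}
```

With bytes written by `encode_record_v1`, the strict decoder returns `UnexpectedEof` while the lenient one fills in the default. The lenient rule only works for fields appended at the very end of the outermost value, which is one reason it is hard to offer generally.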

jamesmunns commented 1 year ago

Hey @jeromegn - thanks for the input, and particularly reminding me about the limitations around serde tunables like untagged, flatten, etc.

I don't know if I have any answers yet, but these are really good data points, so I appreciate it!

At the moment with the "schema on the side" approach, I do expect "deserializing with schema" to be slower than "deserializing without schema", just because it has to do more. I have no idea the order of magnitude increase tho.

For use cases in a database, it might be possible to do an "upgrade" approach, either as part of a migration or "update on access" to switch to the newest schema when you run into old schemas, but I don't have a great story around that yet. Mostly I don't want to make postcard "worse" for existing users who are fine with the current limitations, which means I'm sorta limited to either doing something "on the side", or to make a different library "inspired" by postcard which is more flexible, at some perf/size cost. This would bring it more in line with things like cbor or protobuf.

CBJamo commented 1 year ago

LoRaWan

If you're not familiar with lorawan and want to know more about it, TTN has great docs. For this discussion all you really need to know is that data rates are low (~1-22kb/s), and devices communicate directly and exclusively with a gateway.

Currently, packets either comply with the CayenneLPP spec, or are hand crafted on the device, and hand parsed by a "codec" on the gateway. The codec is, per that spec, written in JS. Cayenne is pretty nice, but if your application doesn't fit into its mold then the fallback of hand crafting packets is pretty dire.

I see two ways to make this situation better. The first would be to extend the gateway to allow webasm binaries in addition to JS. This would be similar to how JS is used now, except you could use postcard as it currently exists and get around hand-crafting packets. But, that's an aside as far as this issue is concerned.

The other solution would be something like a self describing postcard. The gateway would still need to be extended. That's not a big problem though, because the de-facto[1] standard gateway, ChirpStack, was recently rewritten in Rust, which makes adding this feature (once it exists in postcard) nearly trivial. This is similar to Cayenne, but much more flexible. The big downside is that, at least currently, the codec stores no state. The gateway does store state for each device though, so it might be possible to store the description there.

For this application, additional size cost is of concern, but the perf cost is negligible. Lorawan devices don't typically uplink data more often than once a minute.

[1] The two biggest public networks are TTN and Helium. TTN uses chirpstack, and Helium is planning on moving to chirpstack. AFAIK, most ISPs that deploy a lorawan network also use chirpstack, but I don't know if that's always true.

therealfrauholle commented 1 year ago

A usecase we have in practice that was not mentioned here is that we are also looking for a way to just "hash" the schema in a cryptographic manner, so we don't necessarily want to understand the schema. In this way we can give postcard data stronger typing and assert that postcard data has the semantics we expect. JSON has an advantage in this sense because named fields give somewhat stronger guarantees about the semantics of the data.

Thanks for the hint about the serde_reflection crate! I did some more research and saw that supporting schemas in relation to serde has already been discussed a few times, e.g. see https://github.com/serde-rs/serde/issues/345 which is about proposing a generalized way to create schemas in serde. In https://github.com/serde-rs/serde/issues/1785#issuecomment-624493760 a few very interesting crates (also serde_reflection) are mentioned, most notably schemars.

I believe that this ultimately requires general support (not wanting to say "serde support"). But I believe serde should provide a way to walk across the AST of a serde structure. Protocol implementations can then provide a schema generator that infers a postcard schema, json schema or (for our usecase) a schema "hash". I reckon that in combination with const_trait_impl, see https://github.com/rust-lang/rust/issues/67792, this crate's MAX_SIZE can also be implemented without a macro.

I understand that such a thing has not been accepted into serde because it is hard to get right. It could be a strategy to align this crate's Schema implementation with the implementation of schemars, find common patterns and then hopefully bring these into serde as a general concept.

jamesmunns commented 1 year ago

@therealfrauholle for reference, the experimental schema capabilities of postcard here: https://docs.rs/postcard/latest/postcard/experimental/schema/index.html, DOES support Hash (edit: on the generated schema field), and you likely could come up with your own cryptographic way of hashing the schema if the default hasher doesn't fit your needs.

edit: you could send this hash as part of the "header" or "ID" of a message type to ensure coherence.
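
A minimal sketch of the "hash in the header" idea, assuming the schema hash has already been reduced to a 64-bit key by some means. `SchemaKey`, `frame`, `unframe` and `FrameError` are hypothetical names for illustration, and the 8-byte little-endian prefix is an assumed layout, not something postcard defines.

```rust
/// Hypothetical 64-bit schema key, computed elsewhere from the schema.
pub type SchemaKey = u64;

#[derive(Debug, PartialEq)]
pub enum FrameError {
    /// Fewer than 8 bytes, so there is no complete header.
    TooShort,
    /// The sender serialized with a schema the receiver does not expect.
    SchemaMismatch { expected: SchemaKey, found: SchemaKey },
}

/// Prefixes an already-serialized payload with the sender's schema key.
pub fn frame(key: SchemaKey, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&key.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Checks the header against the receiver's own key before handing the
/// payload to the deserializer.
pub fn unframe(expected: SchemaKey, bytes: &[u8]) -> Result<&[u8], FrameError> {
    if bytes.len() < 8 {
        return Err(FrameError::TooShort);
    }
    let (head, payload) = bytes.split_at(8);
    let mut raw = [0u8; 8];
    raw.copy_from_slice(head);
    let found = SchemaKey::from_le_bytes(raw);
    if found != expected {
        return Err(FrameError::SchemaMismatch { expected, found });
    }
    Ok(payload)
}
```

The receiver calls `unframe` with the key of the type it was compiled against and only deserializes on `Ok`. This costs 8 bytes per message; on a constrained link the key could instead be exchanged once on connection.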

The largest reason this hasn't stabilized yet is that I haven't decided whether the schema should hash for JUST "structural" typing or "structural AND nominal" typing.

As an example:

// A - base case
struct Example {
    temp: f32,
    humidity: f32,
}

// B - Type name changed
struct ExaMple {
    temp: f32,
    humidity: f32,
}

// C - fields reordered, but type sequence still the same
struct Example {
    humidity: f32,
    temp: f32,
}

// D - one field renamed, no semantic or structural change
struct Example {
    temperature: f32,
    humidity: f32,
}

Which of these structs should be "the same schema"? If we JUST use structural typing, they are ALL the same (basically: (f32, f32)).

If we only look at nominal typing of the FIELDS, A + B would be equivalent, but none of the others are.

If we look at ALL nominal typing, NONE would be equivalent.

Chances are, the best option is to pick "nominal and structural of types and fields" as the default, but document how someone could implement something different.
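
The three options can be sketched as a single hashing function with a mode switch. This is not postcard's experimental schema type: `StructSchema`, `Mode` and `schema_key` are hypothetical, the four constants mirror structs A to D above, and `DefaultHasher` is used only because it is in the standard library (it is not cryptographic, and its algorithm is not guaranteed to stay the same across Rust releases).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified schema description for a struct of primitives.
pub struct StructSchema {
    pub type_name: &'static str,
    /// (field name, field type) pairs in declaration order.
    pub fields: &'static [(&'static str, &'static str)],
}

#[derive(Clone, Copy)]
pub enum Mode {
    /// Only the sequence of field types.
    Structural,
    /// Field types plus field names, but not the type name.
    FieldNames,
    /// Field types, field names, and the type name.
    FullNominal,
}

pub fn schema_key(schema: &StructSchema, mode: Mode) -> u64 {
    let mut hasher = DefaultHasher::new();
    if let Mode::FullNominal = mode {
        schema.type_name.hash(&mut hasher);
    }
    for (name, ty) in schema.fields {
        if !matches!(mode, Mode::Structural) {
            name.hash(&mut hasher);
        }
        ty.hash(&mut hasher);
    }
    hasher.finish()
}

// The four variants from the example above.
pub const A: StructSchema = StructSchema {
    type_name: "Example",
    fields: &[("temp", "f32"), ("humidity", "f32")],
};
pub const B: StructSchema = StructSchema {
    type_name: "ExaMple",
    fields: &[("temp", "f32"), ("humidity", "f32")],
};
pub const C: StructSchema = StructSchema {
    type_name: "Example",
    fields: &[("humidity", "f32"), ("temp", "f32")],
};
pub const D: StructSchema = StructSchema {
    type_name: "Example",
    fields: &[("temperature", "f32"), ("humidity", "f32")],
};
```

Under `Mode::Structural` all four keys are equal, under `Mode::FieldNames` only A and B share a key, and under `Mode::FullNominal` the keys are expected to all differ (barring a hash collision). Someone wanting a different policy would swap the mode or the hasher.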