flavray / avro-rs

Avro client library implementation in Rust
MIT License

How to determine which branch of a union was used #180

Open jeroentervoorde opened 3 years ago

jeroentervoorde commented 3 years ago

I have a schema that wraps several record types in a union, and they all have the same layout. They differ only in record name. For instance: Added { payload: Record }, Deleted { payload: Record }, Updated { payload: Record }.

I can parse these records, and that gives me a Union(Record[payload: Record]), but I can't find a way to determine whether it's Added, Deleted or Updated.

Is this a limitation of the library and, if so, what do you think about adding a reference to the schema or schema name to the Record struct to make this possible? An alternative would be to add the index of the branch to Union so I can resolve it myself using the reader schema (a bit less user-friendly, but this won't cause any ownership issues, so it might be easier to implement).

I guess both options may break serde deserialization as well. Any ideas about that?
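To make the second option concrete, here is a minimal sketch of resolving the branch from an index stored on the union. The `Schema` and `Value` types below are simplified, hypothetical stand-ins and not avro-rs's actual types; the index field on `Value::Union`, the `branch_name` helper and `example_schema` are assumptions made purely for illustration.

```rust
// Hypothetical, simplified stand-ins for the library's schema and value types.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Record { name: String },
    Union(Vec<Schema>),
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Record(Vec<(String, Value)>),
    // Assumption: the union remembers the index of the branch that was decoded.
    Union(usize, Box<Value>),
}

/// Look up the record name of the union branch that was used,
/// by indexing into the branches of the reader schema.
fn branch_name<'a>(schema: &'a Schema, value: &Value) -> Option<&'a str> {
    match (schema, value) {
        (Schema::Union(branches), Value::Union(index, _)) => match branches.get(*index)? {
            Schema::Record { name } => Some(name.as_str()),
            _ => None,
        },
        _ => None,
    }
}

/// The union from the example above: three records with identical layout.
fn example_schema() -> Schema {
    Schema::Union(
        ["Added", "Deleted", "Updated"]
            .iter()
            .map(|name| Schema::Record { name: name.to_string() })
            .collect(),
    )
}

fn main() {
    let schema = example_schema();
    let payload = Value::Record(vec![("payload".to_string(), Value::String("x".to_string()))]);
    // Index 1 selects the second branch of the union.
    let value = Value::Union(1, Box::new(payload));
    assert_eq!(branch_name(&schema, &value), Some("Deleted"));
}
```

Because only a `usize` is stored on the value, nothing borrows from or clones the schema, which is the ownership advantage mentioned above.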

flavray commented 3 years ago

This looks like one of the recurring issues we've had reported, and we still have no fix for it (yet!) after a few attempts :(

Issues: #61, #95. Previous attempt: #90.

Let me know if you were referring to something else. 🙂

If you want to have a go at this, I'd be happy to review it any time you have something (even if not complete)! If not, I'll give it a go later on.

Regarding the implementation, I think adding the index in the Record would be a decent solution. We wouldn't have to deal with lifetime/cloning shenanigans, and we should have all the building blocks ready for that (albeit quite a lot of work is required to make it happen 😄)

lerouxrgd commented 3 years ago

How about using a dedicated enum as follows:

struct Record;

struct Added {
    payload: Record,
}

struct Deleted {
    payload: Record,
}

struct Updated {
    payload: Record,
}

enum UnionOperation {
    Added(Added),
    Updated(Updated),
    Deleted(Deleted),
}

jeroentervoorde commented 3 years ago

@flavray

No, I think that's the same issue. Sorry about that :) Thanks for the pointers. They'll be very useful.

I'd like to take a stab at this, but I intend to wait until https://github.com/flavray/avro-rs/pull/99 is merged. If I can do something to help there, please let me know.

@lerouxrgd

This is indeed how I intend to deserialize this into a Rust type, but the problem I'm running into now is that the intermediate model (Value::Union and Value::Record specifically) does not contain the information needed to create the right branch of my enum, so that is what I want to change first.

I assume that I'll also need to change the serde deserialization code to use the additional information added to Union or Record to get that working.

I think I can either match the Avro record name to the Rust enum branch name, or use something like #[serde(tag = "recordName")] (as described here: https://serde.rs/enum-representations.html) if the Rust name doesn't match the Avro name. My schema contains fully qualified names, for instance, which I'd like to support.
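Matching on the record name could look roughly like the sketch below. It uses only the standard library and assumes the record name has somehow been made available next to the decoded value, which is exactly the missing piece discussed above. `simple_name`, `from_record`, the `String` payload and the `com.example` namespace are hypothetical and only for illustration.

```rust
// Hypothetical Rust-side type for the union; the payload is reduced to a String for brevity.
#[derive(Debug, PartialEq)]
enum UnionOperation {
    Added(String),
    Updated(String),
    Deleted(String),
}

/// Strip an optional namespace from an Avro full name,
/// e.g. "com.example.Added" becomes "Added".
fn simple_name(full_name: &str) -> &str {
    full_name.rsplit('.').next().unwrap_or(full_name)
}

/// Pick the enum branch whose name matches the Avro record name.
/// Returns None when the record name matches no branch.
fn from_record(record_name: &str, payload: String) -> Option<UnionOperation> {
    match simple_name(record_name) {
        "Added" => Some(UnionOperation::Added(payload)),
        "Updated" => Some(UnionOperation::Updated(payload)),
        "Deleted" => Some(UnionOperation::Deleted(payload)),
        _ => None,
    }
}

fn main() {
    // A fully qualified name and a bare name resolve to the same branch.
    assert_eq!(
        from_record("com.example.Added", "p".to_string()),
        Some(UnionOperation::Added("p".to_string()))
    );
    assert_eq!(
        from_record("Added", "p".to_string()),
        Some(UnionOperation::Added("p".to_string()))
    );
}
```

Dropping the namespace is the simplest choice, but it would conflate two records with the same short name in different namespaces; matching on the full name avoids that at the cost of more verbose patterns.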