flavray / avro-rs

Avro client library implementation in Rust
MIT License
Deserializing Avro unions to Rust enums #192

Open codehearts opened 3 years ago

codehearts commented 3 years ago

I saw the announcement regarding Apache taking ownership of this repo, so let me know if I should move this to their bug tracker!

I'm using rsgen-avro to generate types for my Avro schemas, which for unions like ["null", "int", "long"] creates this in Rust:

#[derive(Debug, PartialEq, Clone, serde::Deserialize, serde::Serialize)]
pub enum UnionIntLong {
    Int(i32),
    Long(i64),
}

#[derive(Debug, PartialEq, Clone, serde::Deserialize, serde::Serialize)]
pub struct Event {
    pub my_field: Option<UnionIntLong>,
}

A null value deserializes to None as expected, but an int or long value yields a "not an enum" error rather than Some(UnionIntLong::Int(…)) or Some(UnionIntLong::Long(…)).

I understand why this happens:

  1. deserialize_enum is called on the avro-rs Deserializer
  2. The current Value token is Int or Long, not Enum, thus the error
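To make those two steps concrete, here is a toy model of the failure. The `Value` enum and the `deserialize_enum` function below are simplified stand-ins written for illustration; they are not the actual avro-rs types or signatures.

```rust
// Toy model only: `Value` and `deserialize_enum` are simplified stand-ins
// for illustration, not the real avro-rs definitions.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int(i32),
    Long(i64),
    /// An Avro `enum` value: (symbol index, symbol name).
    Enum(i32, String),
}

/// Mimics a deserializer that, when asked for a Rust enum, accepts only an
/// Avro `enum` token and rejects every other kind of value.
pub fn deserialize_enum(current: &Value) -> Result<String, String> {
    match current {
        // Step 2 succeeds only when the current token is an Avro enum.
        Value::Enum(_, symbol) => Ok(symbol.clone()),
        // A union branch such as `int` or `long` ends up here.
        _ => Err("not an enum".to_string()),
    }
}
```

In this model, `deserialize_enum(&Value::Int(1))` returns the "not an enum" error even though the Rust side has a perfectly good `Int(i32)` variant waiting for it.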

I tried manually implementing Deserialize, but there's no way (afaik) to tell what Value the current token is. The best I can do is call deserialize_any, which only works for int/long/float/double/boolean types, not strings.

I opened an rsgen-avro issue about this, and the author suggested adding an EnumUnionDeserializer to avro-rs that can properly deserialize enum variants with names like Int or Long. It's not ideal, because avro-rs would have special rules based on the naming of enum variants. Still, it's the only way I've found to deserialize these types correctly, and the only way to deserialize an Avro date to a Date variant instead of Int, since both are i32 values.
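Roughly, the idea is to pick the Rust variant from the kind of the current Avro value. Below is a minimal sketch of that mapping under stated assumptions: `Value`, `variant_name`, and `deserialize_union` are hypothetical names made up for this example, and a real EnumUnionDeserializer would have to go through serde's enum-access traits rather than a plain function.

```rust
// Hypothetical sketch: every name here is illustrative, not avro-rs API.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int(i32),
    Long(i64),
    /// Logical `date`; carries an i32 just like `Int`.
    Date(i32),
}

#[derive(Debug, Clone, PartialEq)]
pub enum UnionIntLong {
    Int(i32),
    Long(i64),
}

/// The variant name the naming convention would look for on the Rust enum.
pub fn variant_name(value: &Value) -> Option<&'static str> {
    match value {
        Value::Null => None,
        Value::Int(_) => Some("Int"),
        Value::Long(_) => Some("Long"),
        Value::Date(_) => Some("Date"),
    }
}

/// Maps the current value onto `Option<UnionIntLong>` by its kind, so the
/// payload type alone (i32 for both Int and Date) never decides the variant.
pub fn deserialize_union(value: &Value) -> Result<Option<UnionIntLong>, String> {
    match value {
        Value::Null => Ok(None),
        Value::Int(n) => Ok(Some(UnionIntLong::Int(*n))),
        Value::Long(n) => Ok(Some(UnionIntLong::Long(*n))),
        // This enum has no matching variant for the value's kind.
        other => Err(format!(
            "UnionIntLong has no variant named {}",
            variant_name(other).unwrap_or("Null")
        )),
    }
}
```

Because the decision is made on the value's kind rather than on its payload type, a `Date` value is never silently folded into `Int`; the cost is the naming convention itself, which both the generator and the deserializer would have to agree on.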

I have a reproduction of my original issue if that's of any help in illustrating what I'm talking about. I'm curious if this idea has any merit or if it's not something avro-rs should implement. Thanks for your time!