serde-rs / serde

Serialization framework for Rust
https://serde.rs/
Apache License 2.0
9.11k stars 771 forks source link

Support internally tagged enums with non-self-describing formats. #2187

Open kevincox opened 2 years ago

kevincox commented 2 years ago

Right now deserialization of internally tagged enums rely on a call to serde::Deserializer::deserialize_any which often doesn't work well with non-self-describing formats. Ideally serde::Deserializer::deserialize_struct would be called instead so that the Deserializer knows what to do.

https://github.com/serde-rs/serde/blob/7e19ae8c9486a3bbbe51f1befb05edee94c454f9/serde_derive/src/de.rs#L1340-L1342

Lucretiel commented 2 years ago

I think the trouble you'll run into here is, what do you pass as fields? deserialize_struct looks like this:

   fn deserialize_struct<V>(
        self,
        name: &'static str,
        fields: &'static [&'static str],
        visitor: V,
    ) -> Result<V::Value, Self::Error>
    where
        V: Visitor<'de>;

Where the fields are the set of field names in the destination struct. For an internally tagged struct, unless all the variants have precisely the same set of field names, there's no valid list of string to pass to fields, since the set of fields varies depending on which variant it ends up being.

kevincox commented 2 years ago

That is a good point. Maybe deserialize_map could be used? It is generally treated as similar to struct without knowing the fields upfront. However I think this is still problematic because if you don't get the tag field first as you won't know which type to deserialize the fields as.

I guess this is just a limitation of internal tagging? Maybe it should be called out in the docs.

jonasbb commented 2 years ago

You can deserialize internally tagged enums from sequences, so forcing deserialize_map is not an option. The enum below can be deserialized from these JSONs ["Bool", false] and ["Int", 123].

#[derive(Deserialize, Serialize, Debug)]
#[serde(tag = "tag")]
enum Internal {
    Int{i: i32},
    Bool{b: bool},
}
kevincox commented 2 years ago

Oh, I didn't realize that was supported. IIUC though the serializer always produces structs/maps though? But that would prevent fixing this as it would break backwards compatibility. Maybe for serde 2 😅

Mingun commented 1 year ago

You can deserialize internally tagged enums from sequences, so forcing deserialize_map is not an option.

Actually, calling specific deserialization method is just a hint what data is expected, deserializer is not obliged to follow it. It is free to call any Visitor method it sees fit