serde-rs / serde

Serialization framework for Rust
https://serde.rs/
Apache License 2.0
8.97k stars 758 forks

Deserializing to a flattened struct calls `deserialize_map`, but deserializing to an already-flat struct does not #1529

Open peterjoel opened 5 years ago

peterjoel commented 5 years ago

The original problem is that my format is not completely self-describing because it does not encode field types, and I need to rely on the types to be inferred from the target struct when deserializing. However, I found that deserialize_any was being called on my Deserializer when I used the #[serde(flatten)] attribute.

I have reduced this problem to the minimal necessary code in the gist linked below. The symptom is that deserialize_map is called in the failing test but not in the passing test.

Deserializing to this struct works (does not call deserialize_map):

#[derive(Deserialize)]
#[serde(rename = "Message")]
struct Flattened {
    a: String,
    b: String,
}

But deserializing to these structs results in a call to deserialize_map:

#[derive(Deserialize)]
#[serde(rename = "Message")]
struct Nested {
    a: String,
    #[serde(flatten)]
    inner: NestedInner,
}

#[derive(Debug, PartialEq, Deserialize)]
struct NestedInner {
    b: String,
}

I also checked that both structures expect the following deserialize tokens:

&[
    Token::Struct { name: "Message", len: 2 },
    Token::Str("a"),
    Token::Str("v1"),
    Token::Str("b"),
    Token::Str("99"),
    Token::StructEnd,
]

Perhaps I have this wrong, but I would expect this to indicate that both visitors should interact with the deserializer identically.
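
To make the difference concrete, here is a simplified, std-only model of the two entry points. It is a sketch rather than serde's real traits or generated code: `ToyDeserializer`, `RecordingFormat`, `deserialize_flat` and `deserialize_nested` are hypothetical names, and the assumption (as far as I understand the derive output) is that a plain struct enters through deserialize_struct while a struct containing a #[serde(flatten)] field enters through deserialize_map.

```rust
// Simplified model only: these are NOT serde's traits or its generated code.

/// Stand-in for the two relevant methods of a format's deserializer.
trait ToyDeserializer {
    fn deserialize_struct(&mut self, name: &'static str, fields: &'static [&'static str]);
    fn deserialize_map(&mut self);
}

/// A format that only records which entry point was used.
#[derive(Default)]
struct RecordingFormat {
    calls: Vec<&'static str>,
}

impl ToyDeserializer for RecordingFormat {
    fn deserialize_struct(&mut self, _name: &'static str, _fields: &'static [&'static str]) {
        // Here the format is told the struct name and field names up front.
        self.calls.push("deserialize_struct");
    }

    fn deserialize_map(&mut self) {
        // Here the format receives no struct name and no field list.
        self.calls.push("deserialize_map");
    }
}

/// Models the assumed derive output for a struct without flatten.
fn deserialize_flat<D: ToyDeserializer>(d: &mut D) {
    d.deserialize_struct("Message", &["a", "b"]);
}

/// Models the assumed derive output for a struct with a flattened field.
fn deserialize_nested<D: ToyDeserializer>(d: &mut D) {
    d.deserialize_map();
}

fn main() {
    let mut flat = RecordingFormat::default();
    deserialize_flat(&mut flat);

    let mut nested = RecordingFormat::default();
    deserialize_nested(&mut nested);

    assert_eq!(flat.calls, ["deserialize_struct"]);
    assert_eq!(nested.calls, ["deserialize_map"]);
}
```

In this model the format sees the field names only on the deserialize_struct path, which is why a format that relies on the target struct for type information would lose that information on the deserialize_map path.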

Full code: https://gist.github.com/peterjoel/a41363fea48c4cb529e4a4bf8421ec20

peterjoel commented 5 years ago

This issue means that using #[serde(flatten)] has a significant performance cost, for example with serde_json. Benchmark here: https://gist.github.com/peterjoel/eafb936143a988f6a922efd2b52ebf18

In my benchmark, the struct with #[serde(flatten)] takes roughly 2.6 times as long to deserialize as the struct that is already flat (225 ns vs. 85 ns per iteration).

test benches::deserialize_to_flat                            ... bench:          85 ns/iter (+/- 2)
test benches::deserialize_to_flat_and_then_convert_to_nested ... bench:          85 ns/iter (+/- 6)
test benches::deserialize_to_nested_and_flattened            ... bench:         225 ns/iter (+/- 24)
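
The second benchmark name suggests a workaround: deserialize into the flat struct, then convert to the nested shape. A minimal sketch of the conversion step follows. The type names are assumptions (the gist's actual types are not reproduced in this thread), and the serde derives are left out so that the block compiles with the standard library alone.

```rust
// Hypothetical types for illustration. In real use, `Flat` would carry
// #[derive(Deserialize)] and would not use #[serde(flatten)].
#[derive(Debug, PartialEq)]
struct Flat {
    a: String,
    b: String,
}

#[derive(Debug, PartialEq)]
struct NestedInner {
    b: String,
}

#[derive(Debug, PartialEq)]
struct Nested {
    a: String,
    inner: NestedInner,
}

// Plain field moves, no allocation: consistent with the conversion adding
// no measurable time in the benchmark above.
impl From<Flat> for Nested {
    fn from(flat: Flat) -> Self {
        Nested {
            a: flat.a,
            inner: NestedInner { b: flat.b },
        }
    }
}

fn main() {
    let flat = Flat { a: "v1".to_string(), b: "99".to_string() };
    let nested = Nested::from(flat);
    assert_eq!(nested.a, "v1");
    assert_eq!(nested.inner.b, "99");
}
```

With serde, the same conversion could presumably be wired in through the container attribute #[serde(from = "Flat")] on Nested, so that callers still deserialize Nested directly; I have not benchmarked that variant.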

andrey-yantsen commented 4 years ago

This seems to be related to https://github.com/RReverser/serde-xml-rs/issues/112 as well: it is hard to implement a proper deserializer that works with #[serde(flatten)] for formats without a strict schema (like XML).