serde-rs / json

Strongly typed JSON library for Rust

Unrecoverable JSON Deserialization Error on Unexpected Enum Variant: "trailing characters" #1142

Closed: mjpauly closed this issue 3 weeks ago

mjpauly commented 3 weeks ago

I'm using an ok_or_default deserializer function (originally from here) to contain deserialization errors in badly formed inner fields: it replaces their values with defaults, so that well-formed fields and the top-level struct still deserialize correctly.

This works most of the time with JSON input, but there's a case where an error deserializing an inner field causes the entire deserialization to fail, even though the inner field is deserialized with ok_or_default and the input is syntactically valid JSON. This seems to happen when an unexpected enum variant appears before another field of the inner object.

Here is a minimal example, and playground link. The only difference between the works2 and fails1 cases is the ordering of the fields of the inner object.

Pardon me if this is expected behavior or has been discussed elsewhere already.

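use serde::{Deserialize, Serialize};

// Deserialize a `T`, falling back to `T::default()` if deserialization fails.
fn ok_or_default<'de, D, T>(d: D) -> Result<T, D::Error>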
where
    T: Deserialize<'de> + Default,
    D: serde::Deserializer<'de>,
{
    Ok(T::deserialize(d).unwrap_or_default())
}

#[derive(Debug, Default, Serialize, Deserialize)]
pub struct Outer {
    // If we encounter an error deserializing `inner`, use its default value.
    #[serde(deserialize_with = "ok_or_default")]
    pub inner: Inner,
}

#[derive(Debug, Default, Serialize, Deserialize)]
pub struct Inner {
    pub bool_field: bool,
    pub enum_field: MyEnum,
}

#[derive(Debug, Default, Serialize, Deserialize)]
pub enum MyEnum {
    #[default]
    One,
    Two,
}

fn main() {
    // works1: complete and valid data                  -> correctly deserializes
    // works2: invalid enum variant after bool field    -> gets default for Inner
    // works3: valid data with enum before bool field   -> correctly deserializes
    // fails1: invalid enum variant before bool field   -> fails to deserialize Outer
    let works1 = r#"{"inner":{"bool_field":true,"enum_field":"Two"}}"#;
    let works2 = r#"{"inner":{"bool_field":true,"enum_field":"Unexpected"}}"#;
    let works3 = r#"{"inner":{"enum_field":"Two","bool_field":true}}"#;
    let fails1 = r#"{"inner":{"enum_field":"Unexpected","bool_field":true}}"#;
    dbg!(serde_json::from_str::<Outer>(works1).unwrap());
    dbg!(serde_json::from_str::<Outer>(works2).unwrap());
    dbg!(serde_json::from_str::<Outer>(works3).unwrap());
    dbg!(serde_json::from_str::<Outer>(fails1).unwrap()); // panic!
}

Output:

[src/main.rs:48:5] serde_json::from_str::<Outer>(works1).unwrap() = Outer {
    inner: Inner {
        bool_field: true,
        enum_field: Two,
    },
}
[src/main.rs:49:5] serde_json::from_str::<Outer>(works2).unwrap() = Outer {
    inner: Inner {
        bool_field: false,
        enum_field: One,
    },
}
[src/main.rs:50:5] serde_json::from_str::<Outer>(works3).unwrap() = Outer {
    inner: Inner {
        bool_field: true,
        enum_field: Two,
    },
}
thread 'main' panicked at src/main.rs:51:48:
called `Result::unwrap()` on an `Err` value: Error("trailing characters", line: 1, column: 55)

mjpauly commented 3 weeks ago

Seems to work fine if the enum field is itself annotated with ok_or_default.

#[derive(Debug, Default, Serialize, Deserialize)]
pub struct Inner {
    pub bool_field: bool,
    #[serde(deserialize_with = "ok_or_default")]
    pub enum_field: MyEnum,
}

Closing, since this fixes the issue.