jonasbb / serde_with

This crate provides custom de/serialization helpers to use in combination with serde's `with`-annotation and with the improved `serde_as`-annotation.
https://docs.rs/serde_with
Apache License 2.0
663 stars 72 forks source link

[Feature request] Deserialization that only uses `FromStr` as a fallback #702

Closed Scripter17 closed 6 months ago

Scripter17 commented 8 months ago

(Putting this here instead of the official serde repo because that one has an issue from 2018 that still hasn't been resolved.)

The serde documentation provides a way to tell a struct to deserialize a field from either FromStr::from_str or Deserialize::deserialize, but as far as I can tell the only ways to have the type in the field always have this behaviour is to manually implement deserialize or have a dummy type.

DeserializeFromStr almost handles this, but trying to derive both this and serde's normal Deserialize doesn't work.

For my particular use case, I have a struct that does lazy regex stuff and I would quite like to deserialize it from both {"pattern": "...", ...} and "..." without having to put #[serde(deserialize_with = "string_or_struct")] every time my wrapper appears in a field.

Effectively I want something like this:

use serde_with::DeserializeFromStrOrStruct;
use serde::Deserialize;

#[derive(DeserializeFromStrOrStruct)]
struct RegexWrapper {
    ...
}

impl FromStr for RegexWrapper {...}

#[derive(Deserialize)]
struct ThingThatContainsRegex {
    // Note the lack of any #[serde(deserialize_with = "...")]
    blah: RegexWrapper,
    ...
}
jonasbb commented 8 months ago

It is possible to write code that does what you want. From a code perspective, the best place to add it would be the serde_derive macro, since that would allow placing the visit_map and visit_str functions next to each other, without any wrappers. You can write your own Deserialize implementation and then delegate to derive generated code using MapAccessDeserializer. That deserialize implementation would be possible to be generated by a derive macro.

With a small modification, it should be possible to have that as a derive macro. Generate a RegexWrapperRemote as an exact copy of RegexWrapper, apply the #[serde(remote="...")] to it, and use it in the Deserialize implementation.

However, I am not sure if this StringOrStruct derive is that important overall.


/*
[dependencies]
serde.version = "*"
serde.features = ["derive"]
serde_json = "*"
*/

use serde::Deserialize;
use serde::Deserializer;
use std::str::FromStr;

#[derive(Debug, Deserialize)]
#[serde(remote = "Self")]
struct RegexWrapper {
    pattern: String,
}

impl FromStr for RegexWrapper {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(Self {
            pattern: s.to_string(),
        })
    }
}

impl<'de> Deserialize<'de> for RegexWrapper {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        struct V;

        impl<'de> serde::de::Visitor<'de> for V {
            type Value = RegexWrapper;

            fn expecting(&self, _f: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
                todo!()
            }

            fn visit_str<E: serde::de::Error>(self, s: &str) -> Result<Self::Value, E> {
                FromStr::from_str(s).map_err(|e| E::custom(e))
            }

            fn visit_map<M: serde::de::MapAccess<'de>>(
                self,
                map: M,
            ) -> Result<Self::Value, M::Error> {
                RegexWrapper::deserialize(serde::de::value::MapAccessDeserializer::new(map))
            }
        }

        deserializer.deserialize_any(V)
    }
}

#[derive(Debug, Deserialize)]
struct ThingThatContainsRegex {
    blah: RegexWrapper,
}

fn main() {
    let _ = dbg!(serde_json::from_str::<ThingThatContainsRegex>(
        r#"{"blah": "foobar"}"#
    ));
    let _ = dbg!(serde_json::from_str::<ThingThatContainsRegex>(
        r#"{"blah": {
            "pattern": "foobar"
        }}"#
    ));
}
Scripter17 commented 8 months ago

Finally got around to trying this and, uh, what?

How does this not cause a "trait implemented twice" error, or any error for that matter. How does #[serde(remote = "Self")] not explode into a stack overflow? Is this supposed to work or is it just a consequence of weird implementation details? How the hell does this just magically work???

And perhaps most importantly and answerably, is there a way to make the serialize derive macro not break with this? I do also need it to work with enums if that matters.

Oh, by the way, thank you so much. I was not ready to handle writing a deserializer for HashMap<String, OtherTypeWithSameFromStrThing> and this is a lifesaver.

jonasbb commented 8 months ago

The remote implementation does not create an impl Deserialize. Instead it adds a deserialize directly to the RegexWrapper struct. That means if you write RegexWrapper::deserialize you call the inherent function first and you need additional qualification if you want to call the deserialize function of the impl Deserialize for RegexWrapper. The impl Deserialize just calls the inherent function, avoiding any recursion issues.

The Serialize will "break" and there is no way to avoid it, as it will now only create an inherent function. You can create a minimal impl Serialize which just forwards to that function.

This remote = "Self" is a special case, but it can come in handy if you want to use the generated implementation but adjust them a bit. You can use cargo expand (available on the playground) to see the code after macro expansion.