samscott89 / serde_qs

Serde support for querystring-style strings
Apache License 2.0
193 stars 67 forks source link

Serialization of Vec<Option<String>> implicitly skips None variants #81

Open zexa opened 1 year ago

zexa commented 1 year ago

Seems like we're skipping serialization on None values, which is kind of unexpected and causes wrong ser/de when None values are required (i.e. to handle in backend). See bellow example:

[package]
name = "serde-url-example"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
serde = { version = "1.0.160", features = ["derive"] }
serde_qs = "0.12.0"
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
struct Example {
    opt_str_array: Vec<Option<String>>,
}

fn main() {
    let example = Example {
        opt_str_array: vec![None, Some("abc".to_string())],
    };

    let encoded = serde_qs::to_string(&example).unwrap();
    println!("{:?} - {:?}", example, encoded);

    let decoded: Example = serde_qs::from_str(&encoded).unwrap();
    println!("{} - {:?}", encoded.clone(), decoded);
}
➜ cargo run
   Compiling serde-url-example v0.1.0 (/home/zexa/Projects/serde-url-example)
    Finished dev [unoptimized + debuginfo] target(s) in 0.45s
     Running `target/debug/serde-url-example`
Example { opt_str_array: [None, Some("abc")] } - "opt_str_array[1]=abc"
opt_str_array[1]=abc - Example { opt_str_array: [Some("abc")] }

I guess representing None values is an issue. Unclear what kind of string should be used for them. At the very least I'd expect the deserializer to know that if there's opt_str_array[1] and opt_str_array[0] was skipped, that must mean that opt_str_array[0] was None.

Additionally, I suppose it would be benefitial to know the amount of items in such cases too. I.e. if it was opt_str_array: vec![None, Some("abc".to_string(), None)], we could do opt_str_array[1]=abc&opt_str_array[2] where opt_str_array[2] would be used to denote the size of the vec.

samscott89 commented 1 year ago

Hey @zexa, thanks for opening an issue.

You're right, that's not I case I had considered previously. The problem is that we have generic logic for handling Option<T> that says: "if the value is None, skip it".

Do you have an existing API you are trying to conform to? If so, can you provide information on what values that would expect to get? Or how would you expect this to work?

I could imagine special-casing this so you woudl get opt_str_array[0]&opt_str_array[1]=abc&opt_str_array[2] -- I think that would deserialize correctly, and would make sense from an serialization standpoint.

What might be the simplest approach for you, as a workaround, would be to implement custom serialize/deserialize logic to do something similar?

samscott89 commented 1 year ago

For example, I think a custom serializer could be:

fn serialize_opt_Str<S>(opt_str_array: &[Option<String>], S) -> Result<S::Ok, S::Error> where S: Serializer {
        let mut seq = serializer.serialize_seq(Some(self.len()))?;
        for e in opt_str_array {
            match e {
                 Some(s) => seq.serialize_element(s)?,
                 None = seq.serialize_element(&())?,
            }
            seq.serialize_element(e)?;
        }
        seq.end()
}

along with:

#[derive(Debug, Clone, Serialize, Deserialize)]
struct Example {
    #[serde(serialize_with="serialize_opt_str")]
    opt_str_array: Vec<Option<String>>,
}
zexa commented 1 year ago

It's likely that we'll stick to your suggested approach, which is to use a custom serializer/deserializer, but to be honest I'd like something generic too - it would be an overall improvement to the lib.

Personally, I'd be satisfied with the provided serializer/deserializer.

Do you have an existing API you are trying to conform to? If so, can you provide information on what values that would expect to get?

I'm hesitant to provide the exact code as it's proprietary. The example provides enough context. Unless you need some specific information?

In which case order is important here. None's must be represented. Empty strings must be represented.

I could imagine special-casing this so you would get opt_str_array[0]&opt_str_array[1]=abc&opt_str_array[2] -- I think that would deserialize correctly, and would make sense from an serialization standpoint.

If you consider a Vec of a lot of elements then that's a lot of characters just to represent a None, which is something to mind when you consider the URL length limit. But I guess at that point it would be smarter to consider other ways of communicating a payload.

On the other hand, I did consider an interesting approach - opt_str_array[5]=[0]nonempty,[1],[3] which would deserialize into Vec(Some("nonempty"), Some(""), None, Some(""), None, None). 5 being the size of the vec. If we know the size of the vec we can assume that indexes that were not mentioned would be None's. The novelty of this approach is that it saves a few characters.