media-io / yaserde

Yet Another Serializer/Deserializer
MIT License

Deserialize Vec<EnumOfStructs> doesn't populate Vec #159

Closed jac-cbi closed 1 year ago

jac-cbi commented 1 year ago

I have an existing XML file format I'm attempting to deserialize into strongly-typed Rust data structures. Here's my unit test:

#[cfg(test)]
mod tests {
    use yaserde::de::from_str;
    use yaserde_derive::YaDeserialize;

    #[test]
    fn simple_reproduction_of_error() {
        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        struct A {
            #[yaserde(attribute)]
            id: u64,
            name: String,
        }
        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        struct B {
            #[yaserde(attribute)]
            id: u64,
            time: String,
        }
        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        struct C {
            #[yaserde(attribute)]
            id: u64,
            count: u64,
            when: B,
        }

        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        enum Stuff {
            A(A),
            B(B),
            C(C),
            #[default]
            E,
        }

        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        struct Base {
            X: bool,
            Y: String,
            stuffs: Vec<Stuff>,
        }

        let src = r#"
            <Base>
              <X>true</X>
              <Y>This is a string</Y>
              <C id="1">
                <count>11250</count>
                <when>
                  <time>11:30</time>
                </when>
              </C>
              <B id="2">
                <time>0600</time>
              </B>
              <A id="3">
                <name>Bob</name>
              </A>
            </Base>
        "#;

        let base: Base = from_str(&src).unwrap();
        assert_eq!(base.stuffs.len(), 3);
    }
}

And here's the error from cargo test simple_reproduction_of_error:

---- tests::simple_reproduction_of_error stdout ----
thread 'tests::simple_reproduction_of_error' panicked at 'assertion failed: `(left == right)`
  left: `0`,
 right: `3`', libXXXX/src/lib.rs:127:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I've run with RUST_LOG=debug, and it seems to parse the whole XML just fine. I also dumped the resulting struct, and the Vec is indeed empty.

What am I doing wrong here?

NOTE: I've also tried adding X and Y to enum Stuff instead of struct Base, and that didn't work either.

EDIT: added id attribute to clarify the reason for these elements being grouped together into a structure of enums.

jac-cbi commented 1 year ago

Neat, I just found cargo expand. The generated code has match arms for X, Y, and stuffs, which clearly doesn't work for the <A>, <B>, and <C> elements. I can't use rename because the element can be any variant of the enum. I tried adding root = "stuffs" to each variant of the enum, with no joy. Any thoughts?

jac-cbi commented 1 year ago

Looking at the generated code, there appears to be a catch-all match arm. Shouldn't that pass the element on to the deserializer for the enum, as opposed to triggering an error for an unexpected XML element?
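
To illustrate why the Vec can stay empty, here is a simplified model of dispatching child elements by field name. It is not yaserde's actual generated code: the `dispatch` function, the `(name, text)` pairs, and the `Vec<String>` standing in for `Vec<Stuff>` are all assumptions made for this sketch, and the catch-all here simply skips the element, which is consistent with the empty result reported above.

```rust
// Simplified model only: NOT the code yaserde generates.
#[derive(Debug, Default)]
struct Base {
    x: bool,
    y: String,
    stuffs: Vec<String>, // stands in for Vec<Stuff>
}

/// Dispatch each (element name, text) pair to the field whose name it matches.
fn dispatch(children: &[(&str, &str)]) -> Base {
    let mut base = Base::default();
    for &(name, text) in children {
        match name {
            "X" => base.x = text == "true",
            "Y" => base.y = text.to_string(),
            // Only an element literally named <stuffs> reaches the Vec.
            "stuffs" => base.stuffs.push(text.to_string()),
            // In this model, <A>, <B> and <C> land here and are skipped.
            _ => {}
        }
    }
    base
}
```

Feeding it the child names from the XML above (X, Y, C, B, A) fills `x` and `y` but leaves `stuffs` empty, because no child element is named `stuffs`.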

MarcAntoine-Arnaud commented 1 year ago

Hello,

Ideally, stuffs: Vec<Stuff> would need to be flattened, like this:

#[yaserde(flatten)]
stuffs: Vec<Stuff>,

but it's not implemented.

So the only way today is to handle the enum variants in the main object.

Here is sample code that allows you to parse the current XML:

#[cfg(test)]
mod tests {
  #[test]
  fn simple_reproduction_of_error() {
    #[derive(Debug, Default, yaserde_derive::YaDeserialize, PartialEq)]
    struct A {
      #[yaserde(attribute)]
      id: u64,
      name: String,
    }
    #[derive(Debug, Default, yaserde_derive::YaDeserialize, PartialEq)]
    struct B {
      #[yaserde(attribute)]
      id: u64,
      time: String,
    }
    #[derive(Debug, Default, yaserde_derive::YaDeserialize, PartialEq)]
    struct C {
      #[yaserde(attribute)]
      id: u64,
      count: u64,
      when: B,
    }

    #[derive(Debug, Default, yaserde_derive::YaDeserialize, PartialEq)]
    struct Base {
      #[yaserde(rename = "X")]
      x: bool,
      #[yaserde(rename = "Y")]
      y: String,
      #[yaserde(rename = "A")]
      a: Vec<A>,
      #[yaserde(rename = "B")]
      b: Vec<B>,
      #[yaserde(rename = "C")]
      c: Vec<C>,
    }

    let src = r#"
      <Base>
        <X>true</X>
        <Y>This is a string</Y>
        <C id="1">
          <count>11250</count>
          <when>
            <time>11:30</time>
          </when>
        </C>
        <B id="2">
          <time>0600</time>
        </B>
        <A id="3">
          <name>Bob</name>
        </A>
      </Base>
    "#;

    let base: Base = yaserde::de::from_str(&src).unwrap();
    println!("{:?}", base);
    assert_eq!(base.a.len(), 1);
  }
}
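
If a single ordered collection is still needed on top of this workaround, the per-type vectors can be merged after deserialization. The following is a minimal sketch under assumptions: the structs are plain copies without the yaserde derives so the example stands alone, and `merge_by_id` and `Stuff::id` are hypothetical helpers, not part of yaserde.

```rust
#[derive(Debug, PartialEq)]
struct A { id: u64, name: String }
#[derive(Debug, PartialEq)]
struct B { id: u64, time: String }
#[derive(Debug, PartialEq)]
struct C { id: u64, count: u64, when: B }

#[derive(Debug, PartialEq)]
enum Stuff { A(A), B(B), C(C) }

impl Stuff {
    /// The id attribute shared by every variant (hypothetical helper).
    fn id(&self) -> u64 {
        match self {
            Stuff::A(a) => a.id,
            Stuff::B(b) => b.id,
            Stuff::C(c) => c.id,
        }
    }
}

/// Merge the three per-type vectors into one Vec<Stuff>, sorted by id.
fn merge_by_id(a: Vec<A>, b: Vec<B>, c: Vec<C>) -> Vec<Stuff> {
    let mut stuffs: Vec<Stuff> = a
        .into_iter()
        .map(Stuff::A)
        .chain(b.into_iter().map(Stuff::B))
        .chain(c.into_iter().map(Stuff::C))
        .collect();
    // Stable sort: items with equal ids keep their relative order.
    stuffs.sort_by_key(|s| s.id());
    stuffs
}
```

With the workaround's `Base`, this would be called as `merge_by_id(base.a, base.b, base.c)`. Note that it orders by id, not by position in the document; the original interleaving of elements is not recoverable from the three separate vectors.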
jac-cbi commented 1 year ago

Yaserde doesn’t care about XML element order? So it was a bad assumption on my part (probably cargo-culted from my implementation using serde-xml-rs). Ha! Thanks for the quick response. I’ll update the issue with the workaround.

MarcAntoine-Arnaud commented 1 year ago

In that case, no: the order is not a constraint.

jac-cbi commented 1 year ago

@MarcAntoine-Arnaud

Just a note to close this out: it turns out I need to do Vec<enum> because I need to sort and iterate based on the common id attribute.

So, I removed the #[derive(... YaDeserialize, ..)] from struct Base (In my real code, this is a layer or two deep in the XML structure) and open coded impl YaDeserialize for Base { fn deserialize(...) }. There weren't many good examples of how to idiomatically do this, so here it goes...

If you see anything I should be doing differently, please let me know :-)

#[cfg(test)]
mod tests {
    use std::io::Read;
    use std::str::FromStr;

    use xml::reader::XmlEvent;
    use yaserde::de::from_str;
    use yaserde::YaDeserialize;
    use yaserde_derive::YaDeserialize;

    #[test]
    fn simple_reproduction_of_error() {
        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        struct A {
            #[yaserde(attribute)]
            id: u64,
            name: String,
        }
        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        struct B {
            #[yaserde(attribute)]
            id: u64,
            time: String,
        }
        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        struct C {
            #[yaserde(attribute)]
            id: u64,
            count: u64,
            when: B,
        }

        #[derive(Debug, Default, YaDeserialize, PartialEq)]
        enum Stuff {
            A(A),
            B(B),
            C(C),
            #[default]
            E,
        }

        #[derive(Debug, Default, PartialEq)]
        struct Base {
            x: bool,
            y: String,
            stuffs: Vec<Stuff>,
        }

        impl YaDeserialize for Base {
            fn deserialize<R: Read>(reader: &mut yaserde::de::Deserializer<R>) -> Result<Self, String> {
                if let XmlEvent::StartElement { ref name, .. } = reader.next_event()? {
                    if name.local_name.as_str() != "Base" {
                        return Err(format!("StartElement.name({}) != \"Base\"", name.local_name.as_str()))
                    }
                } else {
                    return Err("Expected StartElement".to_string());
                }

                let mut base = Base::default();

                loop {
                    let event = reader.peek()?.to_owned();
                    match event {
                        XmlEvent::StartElement { ref name, .. } => {
                            match name.local_name.as_str() {
                                "X" => {
                                    reader.next_event()?;
                                    if let XmlEvent::Characters(text) = reader.peek()?.to_owned() {
                                        // Report malformed text as an error instead of panicking.
                                        base.x = bool::from_str(&text).map_err(|e| e.to_string())?;
                                    }
                                },
                                "Y" => {
                                    reader.next_event()?;
                                    if let XmlEvent::Characters(text) = reader.peek()?.to_owned() {
                                        base.y = String::from(&text);
                                    }
                                },
                                "A" => {
                                    let req = <A as YaDeserialize>::deserialize(reader)?;
                                    base.stuffs.push(Stuff::A(req));
                                },
                                "B" => {
                                    let req = <B as YaDeserialize>::deserialize(reader)?;
                                    base.stuffs.push(Stuff::B(req));
                                },
                                "C" => {
                                    let req = <C as YaDeserialize>::deserialize(reader)?;
                                    base.stuffs.push(Stuff::C(req));
                                },
                                _ => {
                                    return Err(format!("Unhandled name {}", name.local_name.as_str()))
                                }
                            }
                        },
                        XmlEvent::EndElement { ref name, .. } => {
                            if name.local_name.as_str() == "Base" {
                                break;
                            }
                        },
                        _ => {
                            return Err(format!("Unhandled XmlEvent {:?}", event))
                        }
                    }
                    reader.next_event()?;
                }

                Ok(base)
            }
        }

        let src = r#"
            <Base>
              <X>true</X>
              <Y>This is a string</Y>
              <C id="1">
                <count>11250</count>
                <when>
                  <time>11:30</time>
                </when>
              </C>
              <B id="2">
                <time>0600</time>
              </B>
              <A id="3">
                <name>Bob</name>
              </A>
            </Base>
        "#;

        let base: Base = from_str(&src).unwrap();
        assert_eq!(base.stuffs.len(), 3);
        assert!(base.x);
        assert_eq!(base.y, "This is a string");
        for i in base.stuffs {
            match i {
                Stuff::A(a) => {
                    assert_eq!(a.id, 3);
                    assert_eq!(a.name, "Bob".to_string());
                },
                Stuff::B(b) => {
                    assert_eq!(b.id, 2);
                    assert_eq!(b.time, "0600".to_string());
                },
                Stuff::C(c) => {
                    assert_eq!(c.id, 1);
                    assert_eq!(c.count, 11250);
                    assert_eq!(c.when.time, "11:30".to_string());
                },
                Stuff::E => panic!("unexpected default variant Stuff::E"),
            }
        }
    }
}
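
One possible refinement for the hand-written deserializer: element text can be parsed through a small helper that returns a `String` error, matching the `Result<Self, String>` signature used above. This is only a sketch; `parse_bool_text` is a hypothetical helper, not a yaserde function, and accepting `1` and `0` follows the XML Schema boolean lexical forms rather than anything stated in this thread.

```rust
/// Hypothetical helper: parse the text content of a boolean element.
/// Accepts the XML Schema boolean forms: true, false, 1, 0.
fn parse_bool_text(text: &str) -> Result<bool, String> {
    // Surrounding whitespace is ignored before matching.
    match text.trim() {
        "true" | "1" => Ok(true),
        "false" | "0" => Ok(false),
        other => Err(format!("invalid boolean text {:?}", other)),
    }
}
```

Inside the "X" arm this could stand in for the direct `bool::from_str` call, for example `base.x = parse_bool_text(&text)?;`.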