tafia / quick-xml

Rust high performance xml reader and writer
MIT License

Deserializing to variant vector fields fails #288

Open MoSal opened 3 years ago

MoSal commented 3 years ago

Example with tagged and untagged enums showing different errors. The tagged error seems to point to the source of the issue.

use serde::Deserialize;

fn main() {
    // struct

    let bs = br##"<Xs><st><v s="some_s"/></st></Xs>"##;

    // works as expected with 0, 1, or more `v` elements
    let xs: Xs = quick_xml::de::from_reader(&bs[..]).unwrap();
    eprintln!("{:#?}", xs);

    // tagged enum

    // no v, works
    let bn = br##"
        <XEnumWithTag>
          <en type="V">
          </en>
        </XEnumWithTag>
    "##;
    let xn: XEnumWithTag = quick_xml::de::from_reader(&bn[..]).unwrap();
    eprintln!("{:#?}", xn);

    // 1 v or more, fails with: "invalid type: map, expected a sequence"
    let bn = br##"
        <XEnumWithTag>
          <en type="V">
            <v s="some_s"/>
          </en>
        </XEnumWithTag>
    "##;
    let xn_res: Result<XEnumWithTag, _> = quick_xml::de::from_reader(&bn[..]);
    match xn_res {
        Ok(xn) => eprintln!("{:#?}", xn),
        Err(e)   => eprintln!("XEnumWithTag failed to deserialize: {:?}", e),
    }

    // same story with untagged, just different error

    // no v, works
    let bn = br##"
        <XEnumUntagged>
          <en>
          </en>
        </XEnumUntagged>
    "##;
    let xn: XEnumUntagged = quick_xml::de::from_reader(&bn[..]).unwrap();
    eprintln!("{:#?}", xn);

    // 1 v or more, fails with: "data did not match any variant of untagged enum EnumUntagged"
    let bn = br##"
        <XEnumUntagged>
          <en>
            <v s="some_s"/>
          </en>
        </XEnumUntagged>
    "##;
    let xn_res: Result<XEnumUntagged, _> = quick_xml::de::from_reader(&bn[..]);
    match xn_res {
        Ok(xn) => eprintln!("{:#?}", xn),
        Err(e)   => eprintln!("XEnumUntagged failed to deserialize: {:?}", e),
    }
}

#[derive(Deserialize, Debug)]
struct SWrap {
    s: String,
}

#[derive(Deserialize, Debug)]
#[serde(tag="type")]
enum EnumWithTag {
    S{ s: String },
    V{
        //v: Option<SWrap>, // works
        //v: Vec<SWrap>, // fails
        v: Option<Vec<SWrap>>, // fails if not None
    },
}

#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum EnumUntagged {
    S{ s: String },
    V{
        v: Option<Vec<SWrap>>, // fails if not None
    },
}

#[derive(Deserialize, Debug)]
struct St {
    v: Option<Vec<SWrap>>, // works
}

#[derive(Deserialize, Debug)]
#[serde(deny_unknown_fields)]
pub struct Xs {
    st: Option<St>,
}

#[derive(Deserialize, Debug)]
#[serde(deny_unknown_fields)]
pub struct XEnumWithTag {
    en: Option<EnumWithTag>,
}

#[derive(Deserialize, Debug)]
#[serde(deny_unknown_fields)]
pub struct XEnumUntagged {
    en: Option<EnumUntagged>,
}

cpick commented 3 years ago

I hit this error as well. I added a test case (de::tests::enum_::internally_tagged::collection_struct::attributes) in the vec-invariant branch on my fork.

It fails with:

---- de::tests::enum_::internally_tagged::collection_struct::attributes stdout ----
thread 'de::tests::enum_::internally_tagged::collection_struct::attributes' panicked at 'called `Result::unwrap()` on an `Err` value: Custom("invalid type: map, expected a sequence")', src/de/mod.rs:1184:26

Mingun commented 2 years ago

Do I understand correctly that #387 addresses these problems? If not, please try to reduce your example and turn it into a Rust test case.

albx79 commented 1 month ago

I'm on 0.36.1 and I'm still hitting this error. Here is an example:


use serde_derive::Deserialize;

#[test]
fn show_bug() {
    let xml = r###"
    <?xml version='1.0' encoding='UTF-8'?>
    <Root>
        <Entry type="Foo">
            <Datum value="asdf"/>
        </Entry>
        <Entry type="Bar">
            <Datum value="qwer"/>
            <Datum value="zxcv"/>
        </Entry>
    </Root>
    "###;

    #[derive(Debug, Deserialize)]
    struct Root {
        #[serde(rename = "Entry")]
        entries: Vec<Entry>,
    }

    #[derive(Debug, Deserialize)]
    #[serde(tag = "@type")]
    enum Entry {
        Foo {
            #[serde(rename = "Datum")]
            datum: Datum,
        },
        Bar {
            #[serde(rename = "Datum")]
            data: Vec<Datum>
        }
    }

    #[derive(Debug, Deserialize)]
    struct Datum {
        #[serde(rename = "@value")]
        value: String,
    }

    let root: Root = quick_xml::de::from_str(xml).unwrap();
}

which fails with "invalid type: map, expected a sequence".

Mingun commented 1 month ago

@albx79, this is because of the buffering step that internally tagged enums introduce. During buffering, the content of

<Entry type="Bar">
    <Datum value="qwer"/>
    <Datum value="zxcv"/>
</Entry>

is read into serde's private type Content. Because Content is designed to represent any data, it requests deserialization using deserialize_any. However, the XML model has only two types: a string and a map. When deserialize_any is requested, that model is what gets returned; in this case it would be:

Map {
  @type = String(Bar),
  Datum = Map {
    @value = String(qwer),
  },
  Datum = Map {
    @value = String(zxcv),
  },
}

It is buffered as Content::Map, which can respond only to deserialize_map requests, by calling Visitor::visit_map: https://github.com/serde-rs/serde/blob/e08c5de5dd62571a5687f77d99609f9d9e60784e/serde/src/private/de.rs#L1155

However, the data field has type Vec<_>, which requests deserialize_seq, and we try to deserialize it from a buffered Content::Map. In that case Content::Map returns the error "invalid type: map, expected a sequence": https://github.com/serde-rs/serde/blob/e08c5de5dd62571a5687f77d99609f9d9e60784e/serde/src/private/de.rs#L1368-L1376.
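
To make the mismatch concrete, here is a minimal, std-only toy model of the buffering step. Everything in it is hypothetical: the Content enum and the as_map, as_seq, datum, and buffered_bar_entry names only imitate the shape described above and are not the real definitions from serde or quick-xml. It shows that once the element is buffered as a map of strings and maps, a request for a sequence has nothing to answer it.

```rust
// Toy model only: these types imitate the shape of the buffered data
// described above. They are NOT serde's or quick-xml's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum Content {
    Str(String),
    // Entries are kept in document order, so repeated keys are allowed.
    Map(Vec<(String, Content)>),
}

impl Content {
    // Stands in for a `deserialize_map` request: only a buffered map can answer.
    fn as_map(&self) -> Result<&[(String, Content)], String> {
        match self {
            Content::Map(entries) => Ok(entries),
            Content::Str(_) => Err("invalid type: string, expected a map".to_string()),
        }
    }

    // Stands in for a `deserialize_seq` request: the buffer holds no sequence
    // variant at all, so every request of this kind fails.
    fn as_seq(&self) -> Result<Vec<Content>, String> {
        match self {
            Content::Map(_) => Err("invalid type: map, expected a sequence".to_string()),
            Content::Str(_) => Err("invalid type: string, expected a sequence".to_string()),
        }
    }
}

// Buffered form of one `<Datum value="..."/>` element.
fn datum(value: &str) -> Content {
    Content::Map(vec![("@value".to_string(), Content::Str(value.to_string()))])
}

// Buffered form of the `<Entry type="Bar">` element from the example.
fn buffered_bar_entry() -> Content {
    Content::Map(vec![
        ("@type".to_string(), Content::Str("Bar".to_string())),
        ("Datum".to_string(), datum("qwer")),
        ("Datum".to_string(), datum("zxcv")),
    ])
}

fn main() {
    let entry = buffered_bar_entry();

    // Reading the outer element as a map works, so the tag can be found.
    let fields = entry.as_map().unwrap();
    if let Content::Str(tag) = &fields[0].1 {
        println!("tag = {}", tag);
    }

    // The `Vec<Datum>` field then asks the first `Datum` value for a
    // sequence, but that value was buffered as a map, so the request fails.
    let first_datum = &fields[1].1;
    println!("{:?}", first_datum.as_seq());
}
```

In this model the two Datum entries survive buffering as two separate map values under the same key; nothing groups them into a list, which is the information a Vec<_> field would need.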

This situation could be improved in two ways: