ebarnard / rust-plist

A rusty plist parser.
MIT License
71 stars 42 forks source link

plist::from_bytes fails to parse Data(Vec<u8>) #87

Closed smndtrl closed 1 year ago

smndtrl commented 1 year ago

The following is a minimal repro when de-serializing data into Vec

    #[derive(Serialize, Deserialize)]
    struct Data {
        blob: Vec<u8>
    }

    #[test]
    fn plist_data() {
        let blob = r#"<?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
        <plist version="1.0">
        <dict>
            <key>blob</key>
            <data>cOuLPYZefSXGaXnydphRNdj60lLnNkrOez7w45KOJVY=</data>
        </dict>
        </plist>"#;

        let data: Result<Data, plist::Error> = plist::from_bytes(blob.as_bytes());

        match data {
            Ok(d) => {
                println!("d");
            },

            Err(e) => {
                panic!("Error: {}", e)
            },
        }
    }

It fails with

thread 'extractors::plist::tests::plist_data' panicked at 'Error: Serde("invalid type: byte array, expected a sequence")', src/extractors/plist.rs:262:17

I might have missed something in the examples but I haven't found something related to Data de-serialization.

For now I'm using the follow workaround

#[derive(Serialize, Deserialize)]
struct Data2 {
  blob: plist::Value,
}

let data: Result<Data2, plist::Error> = plist::from_bytes(blob.as_bytes());

let plist::Value::Data(inner_data) = d.unwrap().blob else  {
  panic!("Failed")
};

Unfortunately I'm not versed enough in rust to make out the root cause of the initial error.

zummenix commented 1 year ago

I was looking into it a little bit today. I think the problem is that Vec<u8> in the provided example is treated as a plist's array of integers. I'm not sure if it's intentional or just an omission. In any case it would be nice to have a dedicated type in the library to hold plist's data and implement proper serialize/deserialize for it.

Meanwhile, as a second workaround one can implement serde's Deserialize yourself. Turned out it's relatively easy :). Here is how.

Let's introduce our new type that represents a plist's data:

struct Data(Vec<u8>);

To put it in the context of the example, we would use it like this:

#[derive(Deserialize)]
struct Struct {
    blob: Data,
}

Next, we need a visitor implementation for serde:

struct DataVisitor;

impl<'de> Visitor<'de> for DataVisitor {
    type Value = Data;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("a byte array")
    }

    fn visit_bytes<E>(self, v: &[u8]) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        self.visit_byte_buf(v.to_owned())
    }

    fn visit_byte_buf<E>(self, v: Vec<u8>) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(Data(v))
    }
}

Finally, we are implementing Deserialize for Data:

impl<'de> Deserialize<'de> for Data {
    fn deserialize<D>(deserializer: D) -> Result<Data, D::Error>
    where
        D: Deserializer<'de>,
    {
        deserializer.deserialize_byte_buf(DataVisitor)
    }
}

Implementing Serialize is left as an exercise for the reader ;)

ebarnard commented 1 year ago

I'm not sure if it's intentional or just an omission.

Definitely an omission. I'm surprised calling visit_byte_buf on a Vec<u8>'s visitor returns an error though.

In any case it would be nice to have a dedicated type in the library to hold plist's data and implement proper serialize/deserialize for it.

It looks like we're going to have to have one.

ebarnard commented 1 year ago

For the moment, this case should be fixable by ensuring we call visit_seq on any Visitor where the T: Deserialize has called deserialize_seq in the case of Event::Data.