media-io / yaserde

Yet Another Serializer/Deserializer
MIT License
174 stars, 58 forks

Support for "raw" fields/contents? #181

Closed akavel closed 3 months ago

akavel commented 3 months ago

I have some complex XML that I want to deserialize and serialize fully and transparently, but only care to parse a few elements and fields. Do you think it could be acceptable for the library to have an attribute (maybe "raw"?) marking a field to store unknown unprocessed XML verbatim during deserialization, and then to write the contents back verbatim during serialization? (My idea is for something like what is described as ",innerxml" in Go's encoding/xml package's Unmarshal and Marshal function docs.)

If yes, would you have some suggestions/guidance where/how I could try to start implementing it if I wanted to try my luck at a PR for this?

edit: Hm; in the meantime, I guess that for my immediate purpose I could hack something up based on the example of writing a custom impl. for YaDeserialize and YaSerialize, right?

MarcAntoine-Arnaud commented 3 months ago

Hi @akavel,

As it's not standard XML, this crate will not support it. But you can handle it by implementing the serializer and deserializer on your own to do what you want.

Is that fine? There are some examples of custom trait implementations in the repository.

Best, Marc-Antoine

akavel commented 3 months ago

Thanks for the reply @MarcAntoine-Arnaud !

So, I did end up implementing a custom element, and it seems to work well enough for my particular case. I based it on the code in: https://github.com/media-io/yaserde/issues/87 - basically:

use xml::reader::XmlEvent;
use yaserde::{YaDeserialize, YaSerialize};

pub struct OpaqueXml {
    events: Vec<XmlEvent>,
}

impl YaDeserialize for OpaqueXml {
    fn deserialize<R>(reader: &mut yaserde::de::Deserializer<R>) -> Result<Self, String>
    where
        R: std::io::Read,
    {
        use xml::reader::XmlEvent::*;
        let mut events = vec![];
        let start_depth = reader.depth();
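        loop {
            let depth = reader.depth();
            // Look at the next event without taking it out of the reader.
            let peek = reader.peek()?;
            match peek {
                // An EndElement one level deeper than where we started closes
                // the element being captured: record a copy, but leave the
                // event itself in the reader (it must not be consumed here).
                EndElement{..} if depth == start_depth+1 => {
                    events.push(peek.clone());
                    break;
                },
                _ => {},
            }
            // Every other event is consumed and stored as-is.
            events.push(reader.next_event()?);
        }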
        Ok(Self{events})
    }
}

impl YaSerialize for OpaqueXml {
    fn serialize<W>(&self, writer: &mut yaserde::ser::Serializer<W>) -> Result<(), String>
    where
        W: std::io::Write,
    {
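        for ev in &self.events {
            // as_writer_event() returns an Option; skip events that have no
            // writer equivalent and report write errors instead of panicking.
            if let Some(event) = ev.as_writer_event() {
                writer.write(event).map_err(|e| e.to_string())?;
            }
        }
        Ok(())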
    }

    fn serialize_attributes(
        &self,
        source_attributes: Vec<xml::attribute::OwnedAttribute>,
        source_namespace: xml::namespace::Namespace,
    ) -> Result<
        (
            Vec<xml::attribute::OwnedAttribute>,
            xml::namespace::Namespace,
        ),
        String,
    > {
        Err("OpaqueXml cannot be serialized as attribute".to_string())
    }
}

One thing about that: I found the traits underdocumented. The example above helped me, but I still had to work some things out by trial and error, like discovering that I must not consume the EndElement node and can only peek at it. I still don't really know what serialize_attributes() is intended to mean (though I have some suspicions), and I don't understand its contract. It would be really nice if you could consider adding some docs for the traits to help guide how to implement them ❤ But still - thank you for the library anyway, and thank you for building it in a way that allowed me to implement this extension!
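To illustrate the "peek, don't consume" rule in isolation, here is a minimal sketch that uses only the standard library. The Event enum and capture_element function are hypothetical stand-ins invented for this illustration; they are not yaserde or xml-rs types, and the real Deserializer tracks depth on its own. The sketch only shows the shape of the contract as I understand it: copy the closing tag, but leave it in the stream for the caller.

```rust
use std::iter::Peekable;

// Hypothetical stand-in for an XML event stream (not yaserde/xml-rs types).
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Start(String),
    Text(String),
    End(String),
}

/// Captures one whole element, including its own start and end tags.
/// The element's closing tag is copied into the result but left in the
/// stream, so the caller can still see and consume it.
fn capture_element<I>(events: &mut Peekable<I>) -> Vec<Event>
where
    I: Iterator<Item = Event>,
{
    let mut captured = Vec::new();
    let mut depth = 0usize; // number of elements opened and not yet closed
    while let Some(next) = events.peek() {
        match next {
            // Closing tag of the captured element: record a copy and stop
            // without consuming the event.
            Event::End(_) if depth <= 1 => {
                captured.push(next.clone());
                break;
            }
            Event::End(_) => depth -= 1,
            Event::Start(_) => depth += 1,
            Event::Text(_) => {}
        }
        // Any other event is consumed and stored.
        if let Some(event) = events.next() {
            captured.push(event);
        }
    }
    captured
}
```

After capture_element returns, calling peek() on the same stream still yields the element's closing tag, which mirrors what the OpaqueXml deserializer above relies on.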

I'm fine with you closing this ticket as such. Thanks!

MarcAntoine-Arnaud commented 3 months ago

Thank you. I have created a related issue to improve the documentation, so I can close this one.