mitsuhiko / deser

Experimental rust serialization library
https://docs.rs/deser
Apache License 2.0
287 stars 8 forks source link

Remove MapSink and SeqSink #14

Closed mitsuhiko closed 2 years ago

mitsuhiko commented 2 years ago

This is a followup to #6. At the moment MapSink and SeqSink primarily exist because before the introduction of the SinkHandle it was impossible for a Sink to hold state. The solution inherited from miniserde was to create a map/seq sink when map/seq was called which in turn creates the boxed allocation and writes into the slot on finalization.

Now that we can hold a boxed sink directly in the SinkHandle this indirection is not helpful any more. More than that, this indirection causes one new challenge which is that MapSink has its lifetime bound to the outer Sink which makes it hard to compose deserializers. For instance for flattening (#9) it would be nice to be able to inquire another sink about if its interested in a key. This basically requires that a sink also starts another sink so it can drive that one as well. With the current MapSink indirection this is not possible due to lifetimes.

So the plan could be to inline the logic for the map and seq sinks directly onto the main sink. Example for BTreeMap:

impl<K, V> Deserialize for BTreeMap<K, V>
where
    K: Ord + Deserialize,
    V: Deserialize,
{
    fn deserialize_into(out: &mut Option<Self>) -> SinkHandle {
        struct MapSink<'a, K: 'a, V: 'a> {
            slot: &'a mut Option<BTreeMap<K, V>>,
            map: BTreeMap<K, V>,
            key: Option<K>,
            value: Option<V>,
        }

        impl<'a, K, V> MapSink<'a, K, V>
        where
            K: Ord,
        {
            fn flush(&mut self) {
                if let (Some(key), Some(value)) = (self.key.take(), self.value.take()) {
                    self.map.insert(key, value);
                }
            }
        }

        impl<'a, K, V> Sink for MapSink<'a, K, V>
        where
            K: Ord + Deserialize,
            V: Deserialize,
        {
            fn map(&mut self, _state: &DeserializerState) -> Result<(), Error> {
                Ok(())
            }

            fn key(&mut self) -> Result<SinkHandle, Error> {
                self.flush();
                Ok(Deserialize::deserialize_into(&mut self.key))
            }

            fn value(&mut self) -> Result<SinkHandle, Error> {
                Ok(Deserialize::deserialize_into(&mut self.value))
            }

            fn finish(&mut self, _state: &DeserializerState) -> Result<(), Error> {
                self.flush();
                *self.slot = Some(take(&mut self.map));
                Ok(())
            }
        }

        SinkHandle::boxed(MapSink {
            slot: out,
            map: BTreeMap::new(),
            key: None,
            value: None,
        })
    }
}

For structs one could then introduce a value_for_key method which could be reached through to implement flattening. In the following example the StructSink holds an inner struct which should be flattened:

impl<'a, K, V> Sink for StructSink<'a, K, V>
where
    K: Ord + Deserialize,
    V: Deserialize,
{
    fn map(&mut self, _state: &DeserializerState) -> Result<(), Error> {
        Ok(())
    }

    fn key(&mut self) -> Result<SinkHandle, Error> {
        self.flush();
        Ok(Deserialize::deserialize_into(&mut self.key))
    }

    fn value(&mut self) -> Result<SinkHandle, Error> {
        let key = self.key.unwrap();
        if let Some(sink) = self.value_for_key(&key) {
            Ok(sink)
        } else {
            Ok(SinkHandle::null())
        }
    }

    fn value_for_key(&mut self, key: &str) -> Option<SinkHandle> {
        match key {
            "x" => Some(Deserialize::deserialize_into(&mut self.x)),
            "y" => Some(Deserialize::deserialize_into(&mut self.y)),
            other => self.nested.value_for_key(other),
        }
    }

    fn finish(&mut self, _state: &DeserializerState) -> Result<(), Error> {
        self.flush();
        *self.slot = Some(take(&mut self.map));
        Ok(())
    }
}
mitsuhiko commented 2 years ago

Interestingly this also brings up the point that the descriptor will have to move from the seq/map sink into the main sink. One consequence of this is that expecting could go away it becomes possible to use the type name from the descriptor directly.