rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Serializing map types #4115

Closed erickt closed 11 years ago

erickt commented 11 years ago

std::serialization does not directly support serializing a map structure like LinearMap. Should we support it as a top-level construct? Records and structs are similar, but this map would allow for serializing non-string keys.

nikomatsakis commented 11 years ago

This seems like a good idea to me. I presume what you mean is to add methods to the Serializer interface for maps, even though #[auto_serialize] would never generate code that calls them?

nikomatsakis commented 11 years ago

Thinking about this a bit more: one tricky issue that comes to mind is how to manage deserialization. I guess it'd be up to the deserializer to select an appropriate map implementation? Maybe we can supply a hint as part of the serialization as to what kind of map this originally was? Maybe we want a base serializer/deserializer interface that is more-or-less what we have now, and then some extended variants?

erickt commented 11 years ago

Here's an example of where this would be useful: how do we encode a map to and from JSON? Say we wanted to write this:

fn main() {
    let mut map = LinearMap();
    map.insert(~"a", 1);
    map.insert(~"b", 2);
    map.insert(~"c", 3);
    map.serialize(&json::Serializer(io::stdout()));
}

We could make a serializer like this:

impl<S, V: Serializable> LinearMap<~str, V>: Serializable<S> {
    fn serialize(&self, s: &S) {
        let mut i = 0;
        do s.emit_rec {
            for self.each |key, value| {
                s.emit_field(key, i, || value.serialize(s));
                i += 1;
            }
        }
    }
}

However, some serialization formats, like MessagePack (http://msgpack.org/), support maps with non-string keys. To support that, we would have to write our impl to emit a vector of tuples, like this:

impl<S, K: Serializable, V: Serializable> LinearMap<K, V>: Serializable<S> {
    fn serialize(&self, s: &S) {
        let mut i = 0;
        do s.emit_owned_vec(self.len()) {
            for self.each |key, value| {
                do s.emit_vec_elt(i) {
                    do s.emit_tup(2) {
                        s.emit_tup_elt(0, || key.serialize(s));
                        s.emit_tup_elt(1, || value.serialize(s));
                    }
                }
                i += 1;
            }
        }
    }
}

It would take some effort to write serializers that infer that a vector of 2-tuples is actually a map and generate a proper native map. Adding map support to std::serialization could really simplify that code, and make serialization even more broadly applicable.
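To illustrate why the inference is awkward, here is a minimal sketch in present-day Rust (not the 2012 dialect used in this thread). The `Event` enum and the helper functions are hypothetical names invented for the illustration; they are not part of std::serialization. It shows that a map emitted as a vector of 2-tuples produces exactly the same event stream as a genuine vector of pairs, so a serializer has nothing to distinguish them by.

```rust
use std::collections::BTreeMap;

/// Hypothetical event stream a serializer might observe (assumption).
#[derive(Debug, PartialEq)]
enum Event {
    VecStart(usize),
    TupStart(usize),
    Str(String),
    Int(i64),
    End,
}

/// Emit any sequence of (key, value) pairs as a vector of 2-tuples.
fn pairs_events<'a, I>(len: usize, pairs: I) -> Vec<Event>
where
    I: Iterator<Item = (&'a String, &'a i64)>,
{
    let mut events = vec![Event::VecStart(len)];
    for (k, v) in pairs {
        events.push(Event::TupStart(2));
        events.push(Event::Str(k.clone()));
        events.push(Event::Int(*v));
        events.push(Event::End);
    }
    events.push(Event::End);
    events
}

/// A map serialized with only vector and tuple primitives.
fn map_events(map: &BTreeMap<String, i64>) -> Vec<Event> {
    pairs_events(map.len(), map.iter())
}

/// A real vector of pairs, serialized the same way.
fn vec_events(v: &[(String, i64)]) -> Vec<Event> {
    pairs_events(v.len(), v.iter().map(|(k, v)| (k, v)))
}

fn main() {
    let mut map = BTreeMap::new();
    map.insert("a".to_string(), 1);
    map.insert("b".to_string(), 2);
    let list = vec![("a".to_string(), 1), ("b".to_string(), 2)];

    // The two streams are identical, so the map-ness is lost.
    println!("{}", map_events(&map) == vec_events(&list));
}
```

A dedicated map event would remove the ambiguity, at the cost of one more construct every serializer has to handle.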

erickt commented 11 years ago

@nikomatsakis: Good point. Maybe we just provide deserializers for concrete types?

erickt commented 11 years ago

@nikomatsakis: Yes, Serializer::emit_map and the related methods would not be used by auto_serialize, just by types that opt into using them.

erickt commented 11 years ago

Ironically, this doesn't necessarily help with serializing maps to JSON. If we do support non-string-key maps, then to serialize LinearMap<int, int> to JSON you have to do one of the following:

  1. fail on the first non-string key
  2. detect that the keys are non-strings and treat the map as [(int, int)]
  3. always emit maps as [(K, V)] types

I'd love to have trait specialization for this case, but I'm sure that adds a whole host of other problems.
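The three options can be sketched side by side in present-day Rust. Everything here is an assumption made for illustration: the `Key` and `Policy` enums, the function names, and the choice to fix the value type to `i64`. None of it comes from std::serialization, and the string escaping is deliberately minimal.

```rust
use std::collections::BTreeMap;

/// Hypothetical key type that is either a string or an integer.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Key {
    Int(i64),
    Str(String),
}

/// The three strategies from the list above.
#[derive(Debug, Clone, Copy)]
enum Policy {
    FailOnNonString,
    PairsIfNonString,
    AlwaysPairs,
}

/// Quote a string for JSON (only escapes quotes and backslashes).
fn quote(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            _ => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Try to emit a JSON object; give up at the first non-string key.
fn as_object(map: &BTreeMap<Key, i64>) -> Option<String> {
    let mut fields = Vec::new();
    for (k, v) in map {
        match k {
            Key::Str(s) => fields.push(format!("{}:{}", quote(s), v)),
            Key::Int(_) => return None,
        }
    }
    Some(format!("{{{}}}", fields.join(",")))
}

/// Emit an array of [key, value] pairs, which works for any key.
fn as_pairs(map: &BTreeMap<Key, i64>) -> String {
    let pairs: Vec<String> = map
        .iter()
        .map(|(k, v)| {
            let key = match k {
                Key::Int(i) => i.to_string(),
                Key::Str(s) => quote(s),
            };
            format!("[{},{}]", key, v)
        })
        .collect();
    format!("[{}]", pairs.join(","))
}

fn to_json(map: &BTreeMap<Key, i64>, policy: Policy) -> Result<String, String> {
    match policy {
        Policy::FailOnNonString => {
            as_object(map).ok_or_else(|| "non-string key".to_string())
        }
        Policy::PairsIfNonString => Ok(as_object(map).unwrap_or_else(|| as_pairs(map))),
        Policy::AlwaysPairs => Ok(as_pairs(map)),
    }
}

fn main() {
    let mut map = BTreeMap::new();
    map.insert(Key::Int(1), 10);
    map.insert(Key::Int(2), 20);
    for policy in [Policy::FailOnNonString, Policy::PairsIfNonString, Policy::AlwaysPairs] {
        println!("{:?}: {:?}", policy, to_json(&map, policy));
    }
}
```

Note that option 2 makes the output shape depend on the data, so a reader of the JSON has to accept both an object and an array of pairs for the same field.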

erickt commented 11 years ago

(Responding to myself again) I do remember some talk of making a StrMap<V> type that has an interface better oriented toward working with strings. If we had that, then it'd be easy to serialize StrMap<V> to a json map.

catamorphism commented 11 years ago

Added "far future" milestone

nikomatsakis commented 11 years ago

(Deserialization, I realize now, is a non-issue: when deserializing, you always have the type you are deserializing into.)
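A small sketch of that observation in present-day Rust: the same decoded entries become whichever map the caller asks for, because the target type selects the impl. `PairDecoder` and `Deserializable` are hypothetical stand-ins invented for this example, not the real deserializer interface.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical decoder that has already parsed a list of entries.
struct PairDecoder {
    pairs: Vec<(String, i64)>,
}

/// Hypothetical deserialization trait, implemented per target type.
trait Deserializable: Sized {
    fn deserialize(d: &PairDecoder) -> Self;
}

impl Deserializable for HashMap<String, i64> {
    fn deserialize(d: &PairDecoder) -> Self {
        d.pairs.iter().cloned().collect()
    }
}

impl Deserializable for BTreeMap<String, i64> {
    fn deserialize(d: &PairDecoder) -> Self {
        d.pairs.iter().cloned().collect()
    }
}

fn main() {
    let d = PairDecoder {
        pairs: vec![("b".to_string(), 2), ("a".to_string(), 1)],
    };

    // The type annotation alone picks the map implementation;
    // no hint needs to be stored in the serialized data.
    let hashed: HashMap<String, i64> = Deserializable::deserialize(&d);
    let sorted: BTreeMap<String, i64> = Deserializable::deserialize(&d);

    println!("{} entries hashed, {} entries sorted", hashed.len(), sorted.len());
}
```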

catamorphism commented 11 years ago

Revisiting for bug triage. I agree with my earlier self.

alexcrichton commented 11 years ago

Currently there are serialization/deserialization implementations for HashMap, HashSet, TreeMap, and TrieMap. The Encoder interface also has emit_map, emit_map_elt_key, and emit_map_elt_value functions.
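For readers who want a feel for how such map hooks fit together, here is a rough sketch in present-day Rust. Only the three method names are taken from this comment; the signatures, the `Encodable` trait, and the `TextEncoder` type are assumptions made for the illustration and may not match the actual Encoder interface.

```rust
use std::collections::BTreeMap;

/// Hypothetical encoder interface; signatures are assumed, not copied.
trait Encoder: Sized {
    fn emit_i64(&mut self, v: i64);
    fn emit_str(&mut self, v: &str);
    fn emit_map<F: FnOnce(&mut Self)>(&mut self, len: usize, f: F);
    fn emit_map_elt_key<F: FnOnce(&mut Self)>(&mut self, idx: usize, f: F);
    fn emit_map_elt_value<F: FnOnce(&mut Self)>(&mut self, idx: usize, f: F);
}

trait Encodable {
    fn encode<E: Encoder>(&self, e: &mut E);
}

impl Encodable for i64 {
    fn encode<E: Encoder>(&self, e: &mut E) {
        e.emit_i64(*self);
    }
}

impl Encodable for String {
    fn encode<E: Encoder>(&self, e: &mut E) {
        e.emit_str(self);
    }
}

/// Keys may be any encodable type, not just strings.
impl<K: Encodable, V: Encodable> Encodable for BTreeMap<K, V> {
    fn encode<E: Encoder>(&self, e: &mut E) {
        e.emit_map(self.len(), |e| {
            for (i, (k, v)) in self.iter().enumerate() {
                e.emit_map_elt_key(i, |e| k.encode(e));
                e.emit_map_elt_value(i, |e| v.encode(e));
            }
        });
    }
}

/// Toy text format that writes maps as `{key: value, ...}`.
struct TextEncoder {
    out: String,
}

impl Encoder for TextEncoder {
    fn emit_i64(&mut self, v: i64) {
        self.out.push_str(&v.to_string());
    }
    fn emit_str(&mut self, v: &str) {
        // Debug formatting adds the surrounding quotes.
        self.out.push_str(&format!("{:?}", v));
    }
    fn emit_map<F: FnOnce(&mut Self)>(&mut self, _len: usize, f: F) {
        self.out.push('{');
        f(self);
        self.out.push('}');
    }
    fn emit_map_elt_key<F: FnOnce(&mut Self)>(&mut self, idx: usize, f: F) {
        if idx > 0 {
            self.out.push_str(", ");
        }
        f(self);
    }
    fn emit_map_elt_value<F: FnOnce(&mut Self)>(&mut self, _idx: usize, f: F) {
        self.out.push_str(": ");
        f(self);
    }
}

fn encode_to_text<T: Encodable>(value: &T) -> String {
    let mut e = TextEncoder { out: String::new() };
    value.encode(&mut e);
    e.out
}

fn main() {
    let mut map = BTreeMap::new();
    map.insert(1i64, "one".to_string());
    map.insert(2i64, "two".to_string());
    println!("{}", encode_to_text(&map)); // prints {1: "one", 2: "two"}
}
```

Because the encoder is told where each key and value begins, a format with native maps can emit one directly, while a format like JSON can decide for itself how to handle non-string keys.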

I think that this satisfies what the bug was originally for, but feel free to reopen if it's intended for something else!