microsoft / bond

Bond is a cross-platform framework for working with schematized data. It supports cross-language de/serialization and powerful generic mechanisms for efficiently manipulating data. Bond is broadly used at Microsoft in high scale services.

[C++] How do we deserialize two continuous objects from one bulk of memory? #1160

Closed SleepyBag closed 2 years ago

SleepyBag commented 2 years ago

Say I have two Bond structs laid out contiguously in one block of memory and I want to read them both. I create a CompactBinaryReader from the memory blob.

bond::CompactBinaryReader<bond::InputBuffer> reader(buffer);
bond::Deserialize(reader, firstObject);

I can successfully deserialize the first struct from the memory using the snippet above.

But how can I deserialize the second? After the call to Deserialize, the InputBuffer pointer in reader is still at the beginning of the memory. This is because Deserialize takes its reader parameter by value:

template <typename Protocols = BuiltInProtocols, typename Reader, typename T>
inline void Deserialize(Reader input, T& obj)
{
    Apply<Protocols>(To<T, Protocols>(obj), bonded<T, Reader&>(input));
}

And the constructor of CompactBinaryReader also stores a copy of the InputBuffer it is given:

    CompactBinaryReader(typename boost::call_traits<Buffer>::param_type input,
                        uint16_t version_value = default_version<CompactBinaryReader>::value)
        : _input(input),
          _version(version_value)
    {
        BOOST_ASSERT(protocol_has_multiple_versions<CompactBinaryReader>::value
            ? _version <= CompactBinaryReader::version
            : _version == default_version<CompactBinaryReader>::value);
    }

So it seems there is no way to deserialize an object and then continue from where it ends: the buffer pointer is copied, and the copy is destroyed once Deserialize returns. There also seems to be no way to pass a reference to either the InputBuffer or the CompactBinaryReader into the call. Is this the expected behavior, or is it a mistake? If it is expected, what is the right pattern for deserializing two contiguous objects from memory?

chwarr commented 2 years ago

Does passing the input Reader wrapped in a std::reference_wrapper achieve what you want? E.g.,

#include <functional>

bond::CompactBinaryReader<bond::InputBuffer> reader(buffer);
bond::Deserialize(std::ref(reader), firstObject);

(It's been a while since I've worked with C++ and my memory of how template type resolution works is hazy on the details...)
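
To illustrate the mechanism behind this suggestion without depending on Bond, here is a minimal sketch. ToyReader, ReadOne, and ReadTwo are hypothetical names invented for this example and are not part of Bond. It only shows how a function template that takes its reader by value behaves when handed std::ref(reader) instead of reader. Whether bond::Deserialize accepts a std::reference_wrapper in the same way should be checked against the Bond headers.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a reader: a cursor over a vector of ints.
// This is NOT a Bond type; it only models "a reader that owns its position".
struct ToyReader {
    explicit ToyReader(const std::vector<int>& d) : data(&d), pos(0) {}

    int Next() {
        assert(pos < data->size());  // caller must not read past the end
        return (*data)[pos++];
    }

    const std::vector<int>* data;
    std::size_t pos;
};

// Takes its reader by value, like the Deserialize overload quoted above.
// Binding to ToyReader& works for a plain ToyReader (the local copy) and for
// std::reference_wrapper<ToyReader> (through its implicit conversion to T&).
template <typename Reader>
int ReadOne(Reader input) {
    ToyReader& r = input;
    return r.Next();
}

// Reads two consecutive values by passing the same reader through std::ref,
// so each call advances the caller's cursor rather than a copy of it.
inline std::vector<int> ReadTwo(ToyReader& reader) {
    std::vector<int> out;
    out.push_back(ReadOne(std::ref(reader)));
    out.push_back(ReadOne(std::ref(reader)));
    return out;
}
```

Calling ReadOne(reader) advances only a temporary copy, so the caller's pos stays where it was. Calling ReadOne(std::ref(reader)) copies just the wrapper, so the caller's pos moves forward and the next read picks up where the previous one stopped.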

If you have control over the stream that you "pack" multiple Bond objects into, you could add a length prefix before each Bond payload. This would give you other benefits, like being able to skip a payload or, possibly, seek past a corrupt payload. With this information, you could seek in the input buffer or construct a new one for each payload as needed.

<little-endian-uint32-size-of-next-Bond-payload><Bond-marshalled-data><little-endian-uint32-size-of-next-Bond-payload><Bond-marshalled-data>
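
The layout above can be sketched in plain C++ as follows. This is a minimal illustration and not Bond code: AppendFrame, SplitFrames, and Frame are hypothetical names, and the payloads are treated as opaque bytes (std::string is used only as a byte container). It assumes every payload is shorter than 4 GiB and that the whole packed buffer is already in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Appends one frame: a little-endian uint32 length, then the payload bytes.
inline void AppendFrame(std::vector<std::uint8_t>& out, const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);  // length must fit in the prefix
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((n >> shift) & 0xFF));
    }
    out.insert(out.end(), payload.begin(), payload.end());
}

// Describes where one payload sits inside the packed buffer.
struct Frame {
    std::size_t offset;  // index of the first payload byte
    std::size_t size;    // payload length in bytes
};

// Walks the buffer and returns the location of every payload.
// Throws if a length prefix or a payload runs past the end of the buffer.
inline std::vector<Frame> SplitFrames(const std::vector<std::uint8_t>& buf) {
    std::vector<Frame> frames;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        if (buf.size() - pos < 4) {
            throw std::runtime_error("truncated length prefix");
        }
        std::uint32_t n = 0;
        for (int i = 0; i < 4; ++i) {
            n |= static_cast<std::uint32_t>(buf[pos + i]) << (8 * i);
        }
        pos += 4;
        if (buf.size() - pos < n) {
            throw std::runtime_error("truncated payload");
        }
        Frame f;
        f.offset = pos;
        f.size = static_cast<std::size_t>(n);
        frames.push_back(f);
        pos += f.size;
    }
    return frames;
}
```

Each Frame gives a pointer offset and a length, which is enough to build a separate input buffer per payload (for example from buf.data() + frame.offset and frame.size) and deserialize each object independently. Skipping an unwanted payload then costs nothing more than ignoring its Frame.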

chwarr commented 2 years ago

I believe this has been answered, so I'm going to close this. If not, feel free to add another comment with your follow up questions.