Append to archive - Githubissues

pchilds commented 4 years ago

I currently use serialisation to save the ids of successful jobs to a file, so that in the event of a crash we can pick up where we left off. I had no problem with boost, but now that I am moving to cereal it seems that, as far as I can tell, this isn't technically supported, for the following reasons:

Multiple calls to the *OutputArchive ctor will generate multiple XML headers or JSON root nodes.
A single *OutputArchive will not be destructed in the event of a crash, resulting in zero XML output (irrespective of .flush() calls on the stream) or JSON that lacks its closing curly brace.

It is interesting that https://uscilab.github.io/cereal/transition_from_boost.html mentions none of these gotchas.

A workaround would be to deserialise the saved items with an *InputArchive each time, append my successful job id, then serialise the lot back over the file. This would, however, create a lot of overhead and not scale well at all for large job lists. It would be nice to have an append mode. This could be as simple as having an option to output XML without a header, or to chomp the previous JSON closing brace when starting to write the appended object.

The best I can get so far is to use JSON in single call mode and add a trailing brace, if it is missing, prior to reading.
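The brace repair described above could be sketched roughly as follows. This is only an illustration: `close_open_braces` is a hypothetical helper, not part of cereal, and it assumes the file was cut off exactly where the closing brace(s) would have been written, so truncation in the middle of a value or an array is not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a cereal API): returns `text` with enough '}'
// appended to balance every '{' that is still open outside string literals.
// Assumption: the only damage is the missing closing brace(s) at the end.
std::string close_open_braces(const std::string& text) {
    long depth = 0;          // number of currently open '{'
    bool in_string = false;  // inside a JSON string literal
    bool escaped = false;    // previous character was a backslash

    for (char c : text) {
        if (in_string) {
            if (escaped) {
                escaped = false;      // this character is escaped, skip it
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '"') {
                in_string = false;
            }
        } else if (c == '"') {
            in_string = true;
        } else if (c == '{') {
            ++depth;
        } else if (c == '}') {
            --depth;
        }
    }

    std::string repaired = text;
    if (depth > 0) {
        repaired += '\n';
        repaired.append(static_cast<std::size_t>(depth), '}');
    }
    return repaired;
}
```

The repaired text can then be put into a std::istringstream and handed to the JSON input archive instead of the raw file.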

AzothAmmo commented 4 years ago

Since the text archives (json/xml) are not guaranteed to dump their contents until the destructor fires, there indeed isn't a great way to do this with cereal. There's actually a feature request for streaming serialization support from way back when (https://github.com/USCiLab/cereal/issues/55).

You could probably get the behavior you want with a binary archive (as it serializes immediately), but note that in general cereal is only designed to work when saving or loading using a single input or output archive per stream.

It's also possible you may be able to modify either the json or xml archives to achieve your desired behavior, but I would strongly advise keeping to one ctor/dtor per data stream.

pchilds commented 4 years ago

I tried both the json workaround and using the portable binary archive, but in both cases I hit bug #601, which causes objects to be lost when deserialised.

In the end I had to keep the stream open and construct and destruct an XMLOutputArchive each time I wanted to append something. Then when I load the file I have to delete all XML headers except the first one. I haven't checked yet whether I need to combine the cereal nodes too. A very painful workaround.
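The header clean-up step on load could look something like the sketch below. `keep_first_xml_declaration` is a hypothetical helper, not something cereal provides. It assumes every header starts with `<?xml` and ends with `?>`, and that this sequence never appears inside the serialised data itself. It does not merge the repeated cereal root nodes, which is the part left unchecked above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a cereal API): keeps the first "<?xml ... ?>"
// declaration in `text` and removes every later one, together with a single
// newline directly after each removed declaration.
// Assumption: "<?xml" only ever occurs at the start of a declaration.
std::string keep_first_xml_declaration(const std::string& text) {
    const std::string open = "<?xml";
    const std::string close = "?>";

    std::string result;
    std::size_t pos = 0;      // start of the text not yet copied
    bool seen_first = false;  // has the first declaration been copied?

    while (true) {
        const std::size_t start = text.find(open, pos);
        const std::size_t end_tag =
            (start == std::string::npos) ? std::string::npos
                                         : text.find(close, start);
        if (end_tag == std::string::npos) {
            // No further complete declaration: copy the rest unchanged.
            result.append(text, pos, std::string::npos);
            break;
        }
        std::size_t end = end_tag + close.size();
        if (!seen_first) {
            // Copy everything up to and including the first declaration.
            result.append(text, pos, end - pos);
            seen_first = true;
        } else {
            // Copy the text before the repeated declaration, then skip it.
            result.append(text, pos, start - pos);
            if (end < text.size() && text[end] == '\n') {
                ++end;
            }
        }
        pos = end;
    }
    return result;
}
```

Reading the whole file into a string, passing it through this function, and feeding the result to the input archive through a std::istringstream avoids editing the file on disk.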

USCiLab / cereal

Append to archive #598