vgteam / libbdsg

Optimized sequence graph implementations for graph genomics
MIT License

Add magic numbers to the serialize/deserialize methods #24

Closed: adamnovak closed this issue 4 years ago

adamnovak commented 5 years ago

It would be handy to be able to identify the files' types. Serialize would prefix each file with some constant bytes for each graph type, and deserialize would skip over them.

We could also maybe add a method to SerializableHandleGraph that would return the magic number that the class uses, to let larger programs sniff types auto-magically.

VG's IO system now has support for ID-ing non-VPKG files by magic number, so this would help compatibility with vg.
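For illustration, the kind of sniffing a larger program might do can be sketched as below. This is only a rough sketch under assumptions: `sniff_magic_number` is a hypothetical helper (not part of libbdsg or vg), the magic number is taken to be 4 bytes stored big-endian at the very start of the file, and the stream is assumed to be seekable.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of libbdsg or vg.
// Peeks at the first four bytes at the current position of a seekable stream
// and interprets them as a big-endian uint32_t. The stream position is
// restored before returning. Returns false if the stream is not seekable or
// fewer than four bytes are available.
inline bool sniff_magic_number(std::istream& in, uint32_t& magic_out) {
    std::istream::pos_type start = in.tellg();
    if (start == std::istream::pos_type(-1)) {
        return false;  // not seekable, or already in a failed state
    }

    unsigned char bytes[4];
    in.read(reinterpret_cast<char*>(bytes), sizeof(bytes));
    bool got_all = in.gcount() == static_cast<std::streamsize>(sizeof(bytes));

    in.clear();       // a short read sets eofbit/failbit; clear so seekg works
    in.seekg(start);  // rewind so the real deserializer sees the whole file

    if (!got_all) {
        return false;
    }
    // Assemble the value byte by byte so the result does not depend on the
    // host's endianness.
    magic_out = (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16) |
                (uint32_t(bytes[2]) << 8)  |  uint32_t(bytes[3]);
    return true;
}
```

A caller could compare the sniffed value against the magic number each graph class reports and pick the matching class to deserialize with. Streams that cannot seek (pipes, for example) would need a buffering approach instead.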

jeizenga commented 5 years ago

We could use one of those _impl schemes for serialize/deserialize to ensure that each implementation prefixes a serialization with its magic number. As in, have a non-virtual serialize and deserialize that take care of the magic number, and the virtual serialize_impl/deserialize_impl that each implementation overrides.

jeizenga commented 5 years ago

Proposal:

public:

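// A member function cannot be both virtual and static in C++, so this is a
// const virtual method. It returns uint32_t because htonl/ntohl are 32-bit.
virtual uint32_t get_magic_number() const;

private:

virtual void serialize_impl(ostream& out) const;

virtual void deserialize_impl(istream& in);

public:

void serialize(ostream& out) const {
    // Write the raw bytes in network byte order; operator<< would emit decimal text.
    uint32_t magic_number = htonl(get_magic_number());
    out.write((const char*) &magic_number, sizeof(magic_number));
    serialize_impl(out);
}

void deserialize(istream& in) {
    uint32_t magic_number;
    in.read((char*) &magic_number, sizeof(magic_number));
    magic_number = ntohl(magic_number);
    assert(magic_number == get_magic_number());
    deserialize_impl(in);
}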
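
To make the scheme concrete, here is a self-contained sketch of the same non-virtual serialize/deserialize wrapper around virtual _impl methods. Everything named here is an assumption for illustration: `MagicSerializable`, `ToyGraph`, `OtherToyGraph`, and the magic values are hypothetical and are not the libbdsg API. It also swaps two details from the proposal: the bytes are packed big-endian by hand rather than with htonl/ntohl (to avoid the POSIX header), and a mismatch throws std::runtime_error rather than tripping an assert.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical base class illustrating the proposal; not the libbdsg API.
class MagicSerializable {
public:
    virtual ~MagicSerializable() = default;

    // Each implementation reports its own constant.
    virtual uint32_t get_magic_number() const = 0;

    // Non-virtual: always writes the 4-byte magic number first (big-endian),
    // then hands off to the implementation.
    void serialize(std::ostream& out) const {
        uint32_t magic = get_magic_number();
        const char bytes[4] = {
            static_cast<char>((magic >> 24) & 0xFF),
            static_cast<char>((magic >> 16) & 0xFF),
            static_cast<char>((magic >> 8) & 0xFF),
            static_cast<char>(magic & 0xFF)
        };
        out.write(bytes, sizeof(bytes));
        serialize_impl(out);
    }

    // Non-virtual: reads and checks the magic number, then hands off.
    void deserialize(std::istream& in) {
        unsigned char bytes[4];
        in.read(reinterpret_cast<char*>(bytes), sizeof(bytes));
        if (in.gcount() != static_cast<std::streamsize>(sizeof(bytes))) {
            throw std::runtime_error("truncated magic number");
        }
        uint32_t magic = (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16) |
                         (uint32_t(bytes[2]) << 8)  |  uint32_t(bytes[3]);
        if (magic != get_magic_number()) {
            throw std::runtime_error("unexpected magic number");
        }
        deserialize_impl(in);
    }

private:
    virtual void serialize_impl(std::ostream& out) const = 0;
    virtual void deserialize_impl(std::istream& in) = 0;
};

// Toy implementation whose entire "graph" is a single byte.
class ToyGraph : public MagicSerializable {
public:
    char payload = 0;
    // Arbitrary example value: the ASCII bytes "TGM1".
    uint32_t get_magic_number() const override { return 0x54474D31u; }
private:
    void serialize_impl(std::ostream& out) const override { out.put(payload); }
    void deserialize_impl(std::istream& in) override { in.get(payload); }
};

// Second toy type with a different magic number, to show rejection.
class OtherToyGraph : public ToyGraph {
public:
    // Arbitrary example value: the ASCII bytes "TGM2".
    uint32_t get_magic_number() const override { return 0x54474D32u; }
};
```

With this layout, serializing a ToyGraph to a std::stringstream and deserializing it into another ToyGraph round-trips the payload, while deserializing the same bytes into an OtherToyGraph throws. Whether a real implementation should throw, assert, or return an error code on mismatch is a design choice this sketch does not settle.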