Open fred777 opened 1 year ago
Hi @fred777, good to hear that you are using serialize19. Serialize19 was designed to be minimal and extendable. So we only pay for what we actually use. Only the data you serialize is stored in the binary.
We encode the version of the data schema in a header and use the correct serialization method for that version. We basically hash our data layout. But this is far too complicated for this library.
You may store a version indicator in your format and decide which parts to load. Example:
struct Format {
int a{};
int b{};
int c{};
std::string s{};
};
template<class A>
void serialize(A& a, Format& x)
{
uint8_t version = 2;
serialize(a, version);
serialize(a, x.a);
serialize(a, x.b);
if (version == 2) serialize(a, x.c);
serialize(a, x.s);
}
I would also recommend to wrap serialize19 to tweak it better for your use case.
For example use an operator instead of calling serialize
manually.
Regarding the exceptions. We try to avoid them. serialize
assumes that the buffer has enough capacity to read or write the data.
If you cannot ensure that you can wrap the basic archive types and throw your exception if the size is not enough.
The ReadArchive
has two interesting methods you might want to guard.
withPrimitive
is mandatory to guard. SliceReader
would use a reinterpret_cast beyond the buffer.withSlice
will return a slice smaller than the requested n
if not enough data is available.While reading the code again, I guess we should add more documenting comments to this sublibrary.
Hi @arBmind, thank you for the instant reply!
Yes, versioning is one mitigation I was thinking about. But this won't protect me from crashes caused by file corruption of course.
Another solution that just came to my mind could be similar to the SizeArchive: how about an archive reader that just probes the buffer to check if there is enough bytes allocated in it - lets call it SizeVerificationArchive
Then it's up to the user to use that SizeVerificationArchive prior to ReadArchive or not.
Hi @fred777,
Your SizeVerificationArchive
is not a bad idea. My gut feeling says that this is not enough and therefore kind of dangerous. Reading files or network traffic is like reading user input. It can be broken or even come from an attacker.
An serialization library cannot protect against this. We simple have not enough domain knowledge.
I have an extension to serialize19
in one project that looks like this:
enum class ReadResult {
Success,
VersionMismatch, // eg. unknown version
LengthMismatch, // eg. serialized size != expected size
SizeMismatch, // eg. buffer has not enough or too much data
ConstraintViolation, // eg. enum value out of range
};
struct ReadArchive : serialize19::ReadArchive<endianBehaviour()>{
ReadResult result{};
};
I hope that inspires you to go further.
I would appreciate and accept a pull request for a dedicated SizeVerificationArchive
.
Pull request - maybe later - I just found a solution that works for my purposes: enable SEH exceptions in VC :) Now I'm catching either a std::bad_alloc or something generic via catch(...), depending on the type of buffer corruption.
But regarding user input / attacks, the proposed buffer size checking would at least provide a first line of defence to prevent application crashes.
At a second stage, validity of all data members needs to be verified of course.
I'm using serialize19 to create and load binary files of structured data.
But if the file format changes, e.g. because data members have been added / removed, or if the file gets corrupted in some way, the archive reader might cause a read access exception which can't be caught without messing with signal handlers.
=> is there a clean and reliable way to detect and handle this condition without application crash?