camgunz / cmp

An implementation of the MessagePack serialization format in C / msgpack.org[C]
http://msgpack.org
MIT License
340 stars 78 forks source link

Added skipping support #38

Closed m-wichmann closed 7 years ago

m-wichmann commented 7 years ago

This is another attempt to add skipping support (see also #5). This implementation is based on code from CWPack and does not use recursion.

Before this is merged, two things should be considered:

  1. obj_skip_count is currently a uint32_t and could overflow (e.g. while reading malicious data). Extending the type to uint64_t would help with the problem (at least a bit), but I didn't really want to change it, since I use cmp on an embedded device. Maybe instead, we could check for overflow and exit with an error.
  2. Skipping of strings/bin/ext currently reads one byte at a time, since there is no buffer available. This could be replaced with a skip/seek function pointer. Instead a buffer of configurable size could be added (see #3).
antoinealb commented 7 years ago

Nice to see that you added the error checking!

For your first point, you could define obj_skip_count as size_t: http://stackoverflow.com/questions/2550774/what-is-size-t-in-c : size_t is a type guaranteed to hold any array index.

m-wichmann commented 7 years ago

For your first point, you could define obj_skip_count as size_t

I thought about that, but size_t is only guaranteed to be at least 16 bit. Although I don't know of any platform that uses less than 32 bits, it might be a problem for some obscure architectures or compilers. MsgPack allows for arrays and maps to contain up to 2^32 entries (2^33 if you count the keys/values in a map separately). And nested arrays/maps also increment the obj_skip_count. So a single array with 2^32 entries, containing an array with 20 elements will result in an overflow, although it should be valid data. I'm not sure if such a case has to be handled correctly (rather unrealistic data) or if it could be used maliciously.

antoinealb commented 7 years ago

It is also guaranteed to be able to represent any sizeof() result, therefore it will be at least 32 bit on 32 bit machines and 64 bits on 64 bits machines.

camgunz commented 7 years ago

Hey, thank you so much :) :) :+1:

I'll try and dedicate some time to this tomorrow, sorry I've been slacking.

camgunz commented 7 years ago

I've started working on a different approach to skipping, functions starting here. Let me know what you think.

camgunz commented 7 years ago

Closing this; different implementation has been merged into master.