blob deserialization broken

smatting commented 2 years ago

Messages containing blob atoms will break all following atoms. I think this is because the deserializtion doesn't take the padding bytes into account.

How to reproduce:

Create a bundle
Add message with address /a
Append a blob of length 3 with data 01 02 03 (hex)
Append another blob of length 257 with data 99 99 ...

This serializes to

0000:   23 62 75 6e  64 6c 65 00  00 00 00 00  00 00 00 00   #bundle.........
0010:   00 00 01 18  2f 61 00 00  2c 62 62 00  00 00 00 03   ..../a..,bb.....
0020:   01 02 03 00  00 00 01 01  99 99 99 99  99 99 99 99   ................
                                      and so on   ...

Notice how the first blob is padded by a byte 00 to make the overall seralized length a multiple of 4: 4 bytes for the length 00 00 00 03 and 4 bytes for the padded data 01 02 03 00.

This deserializes (incorrectly) to a bundle with message /a containing

A blob of length 3 with data 01 02 03
A blob of length 1 with data 01

It seems the deserialization function dosn't take take padding byte into account when looking for the start of the next atom. When it attempts to read the next blob it reads the length from these four bytes.

01 02 03 00  00 00 01 01
         \----------/

so it arrives at length 00 00 00 01 (=1) and then reads the data from the following byte 01.

You can verify this by playing with size of the second blob. I tested this on master on commit 9315cc5f9aba199af66f07fd2f66ddfc0b96bae7 however I don't have any c code to reproduce this because I'm binding libo to another language.

maccallum commented 2 years ago

Thanks for catching that, and for such a helpful report! I've just pushed a fix to master.

smatting commented 2 years ago

Great! Thank you for fixing this so quickly!

CNMAT / CNMAT-odot

blob deserialization broken #430