hit9 / bitproto

The bit level data interchange format for serializing data structures (long term maintenance).
https://bitproto.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
127 stars 16 forks source link

Ambiguity about the functionality of Encode/Decode #50

Closed zzz00yx closed 4 months ago

zzz00yx commented 1 year ago

For byte-aligned field, Encode/Decode overwrite the dest. For non-byte-aligned field, Encode/Decode is just a BitOR

Message definition:

message Fipr {
    uint1 api_type = 1
    uint5 engine_id = 2
    uint13 pre_install_data = 3
}
message Flit {
    uint32 dw0 = 1
    uint32 dw1 = 2
    uint32 dw2 = 3
    uint32 dw3 = 4
}

Test code:

TEST(BitProto, 0) {
    Fipr fipr{};
    uint8_t buf[256] = {};

    fipr.engine_id = 1;
    EncodeFipr(&fipr, buf);
    fipr.engine_id = 0;
    DecodeFipr(&fipr, buf);
    EXPECT_EQ(fipr.engine_id, 1);

    fipr.engine_id = 2;
    DecodeFipr(&fipr, buf);
    EXPECT_EQ(fipr.engine_id, 1); // FAILED, It's 3

    fipr.engine_id = 8;
    EncodeFipr(&fipr, buf);
    fipr.engine_id = 0;
    DecodeFipr(&fipr, buf);
    EXPECT_EQ(fipr.engine_id, 8); // FAILED, It's 9
}

TEST(BitProto, 1) {
    Flit flit{};
    uint8_t buf[256] = {};

    flit.dw0 = 1;
    EncodeFlit(&flit, buf);
    flit.dw0 = 0;
    DecodeFlit(&flit, buf);
    EXPECT_EQ(flit.dw0, 1);

    flit.dw0 = 2;
    EncodeFlit(&flit, buf);
    flit.dw0 = 0;
    DecodeFlit(&flit, buf);
    EXPECT_EQ(flit.dw0, 2);

    flit.dw0 = 4;
    EncodeFlit(&flit, buf);
    flit.dw0 = 0;
    DecodeFlit(&flit, buf);
    EXPECT_EQ(flit.dw0, 4);
}

The implementation of BpCopyBufferBits in lib/c/bitproto.c make this happen.

What do you think? Is this ambiguity reasonable?

hit9 commented 1 year ago

Hi, sorry for this late response.

The target struct's memory should be cleared before DecodeXXX function is called.

bitproto only touches the target bits, other bits are left unchanged. So, for the not byte-aligned fields, when the target struct is dirty with some pre-filled data there, DecodeXXX goses unexpected results.

For an example, here's an uint8_t integer in C, a = 11, its binary representation in 8bits width is (lower bits on the right)

00001011

Say its integer type in bitproto definition is uint3. So what if we decode a buffer 00000001 onto it?

Only the lowest 3bits are taken care of, the other bits aren't, they're left unchanged, the result is:

00001001

Thus, the result is unexpected if we decode a buffer into a dirty struct.

zzz00yx commented 1 year ago

Hi, sorry for this late response.

The target struct's memory should be cleared before DecodeXXX function is called.

bitproto only touches the target bits, other bits are left unchanged. So, for the not byte-aligned fields, when the target struct is dirty with some pre-filled data there, DecodeXXX goses unexpected results.

For an example, here's an uint8_t integer in C, a = 11, its binary representation in 8bits width is (lower bits on the right)

00001011

Say its integer type in bitproto definition is uint3. So what if we decode a buffer 00000001 onto it?

Only the lowest 3bits are taken care of, the other bits aren't, they're left unchanged, the result is:

00001001

Thus, the result is unexpected if we decode a buffer into a dirty struct.

Emmm. If that's what you say, then it's not a problem. This is me reproducing the case you described. Take a look and see if it conveys the meaning you intended.

// bitproto
message A {
    uint3 a = 1
}

// c
void main() {
    struct A a = {};
    a.a = 0b00001011;
    uint8_t s1[1] = {0b00000001};
    DecodeA(&a, s1);
    assert(a.a == 0b00001001); // Failed, a.a is 0b00001011
}
hit9 commented 1 year ago

Aha, I know what you mean !

Thanks for the code.

I make a simple test, that what I have said is not the truth.

proto test

message A {
    uint3 a = 1
}
// test.c
#include <assert.h>

#include "test_bp.h"

int main() {
    struct A a = {.a = 0b00001011};
    unsigned char s[] = {0b00000001};
    DecodeA(&a, s);

    assert(a.a == 0b00001001);  // Won't pass.....
    assert(a.a == 0b00001011);  // Pass for stanardard mode
    assert(a.a == 0b00000001); // Pass for optimization mode.
    return 0;
}

Hmm, for the optimization mode, the generated code is simply:

int DecodeA(struct A *m, unsigned char *s) {
    ((unsigned char *)&((*m).a))[0] = (s[0] ) & 7;
    return 0;
}

The target field a.a's value is replaced with a byte s[0] & 7, so the result goes to be 0b00000001.

For the standard mode, only the low 3bits of s[0] is "copied" onto field a.a, where the "copy" is bit or operation |. So the result goes to be 0b00001011, and other left 5bits are left untouched.

Both of the 2 cases are unexpected:

  1. the optimization mode: the high 5 bits is replaced, cleared away.
  2. the standard mode: the dirty bits were not cleared, it's or | operation indeed. Non-zero bits affects "copy" result.

What the expecting result 0b00001 001 should be: the high 5 bits should left untouched, and the low 3bits are replaced with s[0] & 7.

Thanks and I'll think about it again.

And welcome your opinion.

hit9 commented 1 year ago

I have implemented in C for the standard mode in this commit https://github.com/hit9/bitproto/commit/99f18248d63fa6ec38601b68d89c909c0733fe2a

It works well but the benchmark on stm32 shows that the encoding/decoding per call costs about 10us more now.

We have to run a bit AND operation at first to clear the target bits buffer, which increases a little the performance cost.

zzz00yx commented 1 year ago

I added a comment to your commit : ) https://github.com/hit9/bitproto/commit/99f18248d63fa6ec38601b68d89c909c0733fe2a#r107054190