google / flatbuffers

FlatBuffers: Memory Efficient Serialization Library
https://flatbuffers.dev/
Apache License 2.0

Does FlatBuffers impose a limit (64K) on a field's size? #3938

Closed loachfish closed 8 years ago

loachfish commented 8 years ago
table NDyInfo
{
        ver:string;
        code1:uint;
        code2:uint;
        code3:uint;
        data:string; // fill with table NDxItem
}

table NDxItem
{
        xxLen:uint;
        data:[ubyte];
}

When the size of NDxItem->data is larger than 0x0000FFFF, the values of code1, code2, and code3 change. Below 0x0000FFFF everything is OK.

Does FlatBuffers impose a limit on a field's size?

loachfish commented 8 years ago

I found this in class FlatBufferBuilder, in the function uoffset_t EndTable(uoffset_t start, voffset_t numfields):

auto pos = static_cast<voffset_t>(vtableoffsetloc - field_location->off);

This casts the position from uoffset_t (uint32_t) to voffset_t (uint16_t).

Can I redefine voffset_t as uint32_t?
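To make the effect of that narrowing cast concrete, here is a minimal sketch. The two typedefs are local stand-ins with the widths named in this comment (32-bit uoffset_t, 16-bit voffset_t), and `to_vtable_slot` and `fits_in_vtable` are hypothetical helper names, not FlatBuffers functions.

```cpp
#include <cstdint>

// Local stand-ins for the typedefs discussed in this thread (assumed widths:
// uoffset_t is 32-bit, voffset_t is 16-bit). Not the library's own header.
using uoffset_t = std::uint32_t;
using voffset_t = std::uint16_t;

// Narrow a 32-bit distance to a 16-bit vtable slot, like the cast in EndTable.
// Values above 0xFFFF lose their upper bits (they are reduced modulo 65536).
inline voffset_t to_vtable_slot(uoffset_t distance) {
    return static_cast<voffset_t>(distance);
}

// True if the distance survives the narrowing unchanged.
inline bool fits_in_vtable(uoffset_t distance) {
    return to_vtable_slot(distance) == distance;
}
```

For example, `to_vtable_slot(0x10004)` yields 4, so a field placed just past 64K would appear to sit at offset 4. That is consistent with the symptom reported above, where unrelated fields change once the data exceeds 0x0000FFFF.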

loachfish commented 8 years ago

If I want to fill a field with more than 64KB of data, how should I deal with this?

I. Modify flatbuffers.h to support field sizes > 64KB:

// format forks to save a bit of space if desired.
typedef uint32_t voffset_t; // redefine voffset_t

and in the function uoffset_t EndTable(uoffset_t start, voffset_t numfields):

assert(table_object_size < 0xffff0000); // Vtable uses 32-bit offsets.

II. Use a macro, selected from the Makefile, to choose the width, like:

#if defined(USE_8BIT)
typedef uint8_t voffset_t;
#elif defined(USE_16BIT)
typedef uint16_t voffset_t;
#elif defined(USE_32BIT)
typedef uint32_t voffset_t;
#else
typedef uint16_t voffset_t;
#endif


ghost commented 8 years ago

Can you show your serialization code?

You are likely encoding the FlatBuffer wrong. Make sure the string/vector is created before the table.

C++ should give you an assert when you get this wrong. Run your code in debug mode.

No need to change voffset_t.
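The ordering rule above (create the string/vector first, then the table) can be sketched with a toy builder. This is not the FlatBuffers API: `ToyBuilder` and both helper functions are hypothetical, and the class models only the single rule that child objects may not be created while a table is open. It throws instead of asserting so the behavior does not depend on the build mode.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model only, NOT the FlatBuffers API. It mimics a single rule of the
// real builder: no string or vector may be created while a table is open.
class ToyBuilder {
public:
    // Append a string to the buffer and return where it starts.
    std::size_t CreateString(const std::string& s) {
        if (table_open_) {
            throw std::logic_error("nested build: create strings before the table");
        }
        const std::size_t offset = buf_.size();
        buf_.insert(buf_.end(), s.begin(), s.end());
        return offset;
    }
    void StartTable() { table_open_ = true; }
    void AddOffset(std::size_t offset) { fields_.push_back(offset); }
    void EndTable() { table_open_ = false; }

private:
    bool table_open_ = false;
    std::vector<char> buf_;
    std::vector<std::size_t> fields_;
};

// Correct order: child object first, then the table that refers to it.
inline bool build_children_first() {
    ToyBuilder b;
    const std::size_t s = b.CreateString("payload");
    b.StartTable();
    b.AddOffset(s);
    b.EndTable();
    return true;
}

// Wrong order: the string is created while the table is still open.
inline bool build_nested_is_rejected() {
    ToyBuilder b;
    b.StartTable();
    try {
        b.AddOffset(b.CreateString("payload"));
    } catch (const std::logic_error&) {
        return true;  // the toy builder refused the nested call
    }
    return false;
}
```

In the real library the equivalent check is an assert, so it only fires in builds where asserts are enabled.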

loachfish commented 8 years ago

NDxProtocolData.fbs

enum NDxCommandType : ubyte
{
    Unknown = -1,
    PointToPoint = 1,
    GroupBroadcast = 2,
    Broadcast = 3
}

table NDxDataHeader
{
    cmd:uint = 4294967295;
    from:uint = 4294967295;
    to:uint = 4294967295;
    cmdType:NDxCommandType = Unknown;
    devNo:uint = 4294967295;
    fromMac:uint = 4294967295;
    toMac:uint = 4294967295;
}

table NDxProtocolData
{
    version:string;
    head:NDxDataHeader;
    data:string;
}

root_type NDxProtocolData;

NDySubCommand.fbs

table NDySubCommand
{
    dataLen:uint;
    data:[byte];
}

root_type NDySubCommand;

Encode

void sendySubCommand(uint32_t size_)
{
    flatbuffers::FlatBufferBuilder fbb;
    NDxProtocolDataBuilder pdb(fbb);
    pdb.add_head(CreateNDxDataHeader(fbb, 3/*code1*/, 10000/*from*/, 10086/*to*/, NDxCommandType_PointToPoint, 12/*devNo*/, 168/*fromMac*/, 172/*toMac*/));
    {
        uint32_t buffSize = size_;
        char* buff = new char[buffSize];
        flatbuffers::FlatBufferBuilder fbb2;
        fbb2.Finish(CreateNDySubCommand(fbb2, buffSize, fbb2.CreateVector((int8_t*)(buff), buffSize)));
        pdb.add_data(fbb.CreateString((char*)fbb2.GetBufferPointer(), fbb2.GetSize()));
        delete[]buff;
        buff = NULL;
    }
    fbb.Finish(pdb.Finish());
    // to send data
    // if (NULL != m_sender) {
    //     m_sender->sendData((char*)fbb.GetBufferPointer(), fbb.GetSize());
    // }
}
loachfish commented 8 years ago

It really was an encoding problem.

The correct encoding sequence:

std::vector<signed char> vec;
vec.clear();
for (int i = 0; i < size_; i++) {
    vec.push_back('D');
}

flatbuffers::FlatBufferBuilder fbb;
auto fbVec = fbb.CreateVector(vec);
NDySubCommandBuilder vr_builer(fbb);
vr_builer.add_dataLen(size_);
vr_builer.add_data(fbVec);
fbb.Finish(vr_builer.Finish());

std::string vrStr("");
vrStr.append((char*)fbb.GetBufferPointer(), fbb.GetSize());

flatbuffers::FlatBufferBuilder fbb_2;

auto fbStr = fbb_2.CreateString(vrStr.c_str(), vrStr.size());
auto header = extavc::CreateNDxDataHeader(fbb_2,
                3, 10000, 10086, NDxCommandType_PointToPoint, 
                12, 168, 172);

NDxProtocolDataBuilder pdb(fbb_2);
pdb.add_head(header);

{
    pdb.add_data(fbStr);
}

fbb_2.Finish(pdb.Finish());
std::string outData((char*)fbb_2.GetBufferPointer(), fbb_2.GetSize());
loachfish commented 8 years ago
  1. Why does the first encoding sequence work when the vector size is less than 0x0000FFFF?
  2. This special encoding sequence may not be easy to use in some scenarios.
ghost commented 8 years ago
  1. It is not ok. As I said, an assert will happen if you try. You should have gotten the assert from the inner CreateString in your original code. You are apparently developing with asserts off (in release mode), which is a bad idea. Make sure that if you write assert(false) in your program, the program halts/crashes/stops in the debugger.
  2. Yes, it is less convenient, but necessary for maximum efficiency.
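One way to check whether asserts are actually active in a given build is sketched below. The helper name `asserts_enabled` is hypothetical. It relies only on standard behavior: when NDEBUG is defined, `assert` expands to nothing, so its argument is never evaluated.

```cpp
#include <cassert>

// Returns true only when assert() is active in this translation unit.
// With NDEBUG defined, assert() expands to nothing and the flag stays false.
inline bool asserts_enabled() {
    bool enabled = false;
    assert((enabled = true));  // the assignment only runs when asserts are on
    return enabled;
}
```

Logging the result once at startup would show whether a build could ever report a nested-construction mistake like the one in this thread.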
loachfish commented 8 years ago

thx for your reply.

loachfish commented 8 years ago

In the end, the cause was the wrong encoding sequence, which triggers the NotNested assert.

But I have some new questions:

  1. If the assert is disabled, what is the effect of encoding in nested fashion?
  2. With the assert disabled (as in 1), packing and unpacking the FlatBuffer with the first encoding sequence works when the size is less than 64K, and the verifier check passes too. When the size exceeds 64K, the verifier reports errors. Why 64K?
  3. If I make sure the size is less than 64K, use the first encoding method, and keep the assert disabled, are there other hidden problems?
ghost commented 8 years ago
  1. You will be encoding child-objects inside the table, which sorta works if the child objects are small, but not how the system was designed.
  2. All access to a table goes through a vtable, which uses 16bit offsets. 64k was meant to be enough for a table without its child objects.
  3. Your serialized data will typically be bigger and less efficient to access since you have more vtable duplication, and less cache efficient access.
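The vtable point in answer 2 can be illustrated with a simplified model. This is a sketch, not the real FlatBuffers layout code: `ToyTable`, `SetU32`, `GetU32` and `kMaxFieldOffset` are hypothetical names. The assumption is that each field is found through a 16-bit offset from the start of the table, and that an offset of 0 marks an absent field.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified model, not the real FlatBuffers layout code: a table is a byte
// array, and its vtable holds one 16-bit offset per field (0 = field absent).
struct ToyTable {
    std::vector<std::uint16_t> vtable;  // field index -> offset into `bytes`
    std::vector<std::uint8_t> bytes;    // inline table data

    // Store a 32-bit value at `offset` and record it in the vtable.
    void SetU32(std::size_t field, std::uint16_t offset, std::uint32_t value) {
        if (vtable.size() <= field) vtable.resize(field + 1, 0);
        if (bytes.size() < offset + sizeof value) bytes.resize(offset + sizeof value, 0);
        std::memcpy(bytes.data() + offset, &value, sizeof value);
        vtable[field] = offset;
    }

    // Read a 32-bit field through the vtable, or return `def` if absent.
    std::uint32_t GetU32(std::size_t field, std::uint32_t def) const {
        if (field >= vtable.size() || vtable[field] == 0) return def;
        std::uint32_t value;
        std::memcpy(&value, bytes.data() + vtable[field], sizeof value);
        return value;
    }
};

// Every field must start within the first 64K of the table, because that is
// the largest value a 16-bit vtable entry can hold.
constexpr std::uint32_t kMaxFieldOffset = 0xFFFF;
```

When a large child object is written inside the table (the nested encoding), later fields can land beyond `kMaxFieldOffset`, so a 16-bit entry can no longer describe their position. Building the child objects first keeps the table itself small.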
loachfish commented 8 years ago

got it! thx.