starwing / lua-protobuf

A Lua module to work with Google protobuf
MIT License
1.73k stars 387 forks source link

repeated wrappers are deserialized incorrectly #194

Open git-torrent opened 2 years ago

git-torrent commented 2 years ago

Let's have message:

message Test {
  uint32 int = 1
  repeated uint32 ints = 2;
  repeated google.protobuf.UInt32Value wrapped_int = 3;
  repeated google.protobuf.UInt32Value wrapped_ints = 4;
}

filled with values

test = {
  int = 0,
  ints = { 0, 0, 0 },
  wrapped_int = { value = 0 },
  wrapped_ints = { {value = 0}, {value = 0}, {value = 0}}
}

test = pb.decode( pb.encode(test) ) would return:

test = {
  ints = { 0, 0, 0 },
  wrapped_ints = { {}, {}, {} }
}

native and wrapped types are not encoded in the same way. On the other side, google implementation (checked Go, python, C#) decodes this message to:

test = {
  ints = { 0, 0, 0 },
  wrapped_ints = { {value = 0}, {value = 0}, {value = 0} }
}

The same applies for other repeated wrappers. All of them should be intialized to default values.

Thanks you.

starwing commented 2 years ago

This is intentional for now. Because to create tables for empty sub message may decrease the performance. If this is really needed, I can add a option to enable this behavior. Any pull request are welcome.

git-torrent commented 2 years ago

That reminds me of old joke about Pentium FDIV bug - incorrect but fast :)

It is needed if the goal is to comply with google implementation. I would be glad to prepare PR, but I am not that skilled. I can fix the lua side by adding if wrapper then return value or default.

starwing commented 2 years ago

This is a history issue. The module is written in the time of protobuf 2.0, at that time the sub message has not default empty value. But the semantic is changed when the protobuf 3 is out. and protobuf3 is designed by C++ semantic. That means to simulate that behaviour from Lua will largely decrease the performance: a message of C++ is just a struct that can be easily cleaned to match the protobuf 3 semantic, but this is not true for Lua.

starwing commented 2 years ago

I just add a new option "decode_default_message" to support decode the empty but with default values sub messages. You could try the master HEAD for see whether it fit your need.

git-torrent commented 2 years ago

Thank you for quick help and explanation.

I looked up the affected code but unfortunately not able to comprehend that. I will wait for release. Thank you again

starwing commented 2 years ago

You say this?

    if (!e->LS->encode_default_values && ignoredlen != 0 && ignorezero)
        e->b->size -= (unsigned)(ignoredlen + hlen);

The protobuf3 do not encode the zero-value (and has not default value, that is a protobuf2 thing). the second line used to cancel the encoded zero-value, if these conditions are meet:

git-torrent commented 2 years ago

Thanks again for explanation. Do I understand it right that it enables encoding of all the default values?

(this is useful option in case of proto->json conversion which I must do manually).

starwing commented 2 years ago

Yes, it will. be.

git-torrent commented 2 years ago

It might help in some scenarios. On the other side, turning it on for not loosing array members might bloat the protobufs, what goes against their efficacy.

We can leave this issue as a request for enhancement.