tomjakubowski / nanopb

Automatically exported from code.google.com/p/nanopb
0 stars 0 forks source link

Nanopb doesn't handle strings larger than 255 characters. #20

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1.
Define message like this:

import "nanopb.proto";
message msg {
  required string string1       = 2 [(nanopb).max_size = 256];
  required string String2       = 3 [(nanopb).max_size = 256];
}

2.
Observe that the value of msg_fields[].data_size will be initialized to zero by 
the macro function pb_membersize(msg, string1) (or string2). This happens 
because data_size is a uint8_t, and setting it to the value 256 is an overflow, 
which rolls data_size back to 0.

3.
The resulting behavior is that when msg is encoded, string1 is encoded twice, 
instead of string1 being encoded, followed by string2. This follows because 
when we get to

bool checkreturn pb_encode(pb_ostream_t *stream, const pb_field_t fields[], 
const void *src_struct)
{
    const pb_field_t *field = fields;
    const void *pData = src_struct;
    const void *pSize;
    size_t prev_size = 0;

    while (field->tag != 0)
    {
        pb_encoder_t func = PB_ENCODERS[PB_LTYPE(field->type)];
        pData = (const char*)pData + prev_size + field->data_offset;
        pSize = (const char*)pData + field->size_offset;

        prev_size = field->data_size;

prev_size is set to 0, and the subsequent iteration through the loop doesn't 
move the pointer to the data, and so we simply encode string1 again.

What is the expected output? What do you see instead?
I'd expect one of two possibilities.

One possibility is that nanopb should work correctly with strings of size 256 
or larger. To do this, I suspect you'll have to increase the size of uint8_t 
data_size; Maybe make it a size_t instead of a uint8_t. 

The second possibility is that when generating the .c and .h files from the 
.proto/.pb files, the generator code could emit a warning saying that we'll get 
weird behavior. 

What version of the product are you using? On what operating system?
Scientific Linux 6.2 (32 bit), and I'm using release 0.1.3.

Please provide any additional information below.
We aren't using nanopb in a memory constrained environment, so we don't really 
mind the extra memory overhead of using size_t. One thought is to define 
#define pb_size_type, and default to size_t (or uint8_t, either way), but let 
the end user pass a compiler flag -DUSE_SMALL_SIZE_TYPE to switch pb_size_type 
from size_t to uint8_t, for those users who have memory constrained 
environments.

Original issue reported on code.google.com by jone...@jonesmz.com on 5 Jul 2012 at 4:33

GoogleCodeExporter commented 8 years ago
[deleted comment]
GoogleCodeExporter commented 8 years ago
Oh, related to this, I've got multiple messages, each containing strings, that 
I'm using in a "union" type message. Nanopb is very unhappy with me because of 
the same reason, but since I'm using the union_encode paradigm (from the 
union_encode example directory, thanks!) this variant of the bug doesn't effect 
me. 

Original comment by jone...@jonesmz.com on 5 Jul 2012 at 4:36

GoogleCodeExporter commented 8 years ago
Duplicate of issue #14. Fixed in newest git master.

Original comment by Petteri.Aimonen on 5 Jul 2012 at 5:07