In a nutshell: bufr_value_set_string(bv, buf, len) assigns a string field of arbitrary length to a BufrValue. Internally, it ensures that the string is padded out to len with blanks (e.g. a C string of length len-3 will be padded to len bytes).
However, the encoder operates under the assumption that len is the same as the descriptor datawidth. If the user (cough) naively believes that len == strlen(buf), then the encoder dumps some random chunk of memory into the output.
Besides being a potential buffer overflow, this breaks things like BUFR compression (see the "differs check" in bufr_put_ccitt_compressed) and generally produces BUFR binary messages where identical inputs and code lead to differing outputs.
It also means more management overhead for API users. BufrValue objects are frequently used standalone, with the various set/get functions implicitly converting types as needed and generally hiding the BUFR type details. Requiring the caller to "know" the datawidth means they have to keep tables around everywhere.
The simplest fix is to introduce a
bufr_put_padstring(BUFR_Message *bufr, const char *str, int len, int enclen)
function that implicitly blank-pads output strings when len < enclen.
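To make the padding rule concrete, here is a minimal sketch of the logic such a function would need. It is not library code: pad_to_width is a hypothetical helper that writes into a plain char buffer rather than a BUFR_Message, and the decision to truncate when len > enclen is my assumption, not something the encoder is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the library): fill an encoded field of
 * enclen bytes from a source string of len bytes.
 *   - copies at most enclen bytes (assumed behaviour: truncate if too long)
 *   - stops at a NUL so it never reads past the end of a short C string
 *   - blank-pads the remainder of the field
 * out must have room for enclen bytes and is NOT NUL-terminated.
 * Returns the number of blanks that were added.
 */
static int pad_to_width(char *out, const char *str, int len, int enclen)
{
    assert(out != NULL && str != NULL && len >= 0 && enclen >= 0);

    /* Never copy more than the encoded field can hold. */
    int limit = (len < enclen) ? len : enclen;

    /* Copy byte by byte, stopping early at the string terminator. */
    int ncopy = 0;
    while (ncopy < limit && str[ncopy] != '\0') {
        out[ncopy] = str[ncopy];
        ncopy++;
    }

    /* Fill whatever is left of the field with blanks. */
    memset(out + ncopy, ' ', (size_t)(enclen - ncopy));
    return enclen - ncopy;
}
```

With this rule, a 3-byte "ABC" written into an 8-byte field always comes out as "ABC" followed by five blanks, so identical inputs give identical encoded bytes whether or not the caller knew the datawidth.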
Imported from Launchpad using lp2gh.