Parallel-NetCDF / PnetCDF

Source code repository of PnetCDF library and utilities
https://parallel-netcdf.github.io
Other
83 stars 23 forks source link

Possible mis-reading from using "strlen" on uninitialized string. #81

Open cponder opened 2 years ago

cponder commented 2 years ago

I'm testing MPAS-A with PNetCDF 1.12.3 and Valgrind is giving me messages like this:

1: ==2752438== Conditional jump or move depends on uninitialised value(s)
1: ==2752438==    at 0x4842658: strlen (vg_replace_strmem.c:494)
1: ==2752438==    by 0x4E2CB74: ncmpio_new_NC_var (ncmpio_var.c:71)
1: ==2752438==    by 0x4E26D8E: hdr_get_NC_var (ncmpio_header_get.c:1067)
1: ==2752438==    by 0x4E24201: ncmpio_hdr_get_NC (ncmpio_header_get.c:1239)

I get ~400 of these messages and am wondering if you can patch this on the PNetCDF side.

cponder commented 2 years ago

The variable name is declared and processed here in the file src/drivers/ncmpio/ncmpio_header_get.c

   1028     char *name;
   1029     NC_var *varp;
   1030 
   1031     /* get name */
   1032     err = hdr_get_NC_name(gbp, &name);
                           ....
   1066     /* allocate space for NC_var object */
   1067     varp = ncmpio_new_NC_var(name, ndims);

The function-call on line 1032 allocates the name here in the same file

    531     /* Allocate a NC_string structure large enough to hold nchars characters.
    532      * Note nchars is strlen(namestring) without terminal character.
    533      */
    534     *namep = (char*) NCI_Malloc((size_t)nchars + 1);
    535     if (*namep == NULL) DEBUG_RETURN_ERROR(NC_ENOMEM)
    536     (*namep)[nchars] = '\0'; /* add terminal character */

By my reading, the NCI_Malloc does not initialize the contents of the array. In the file src/drivers/ncmpio/ncmpio_var.c the strlen gets invoked

     71     varp->name_len = strlen(name); /* name has been NULL checked */

which Valgrind is flagging.

cponder commented 2 years ago

It looks like what you're doing is measuring the length of the array, rather than the length of the string it contains, since it was never assigned a value along this execution-path. You do append a '\0' to the end in the appropriate index to do this. But you can't guarantee that there's not a random '\0' in one of the lower indices of the array, which would give an incorrect value. In particular, if the array were initialized to zero, which is the most common case, this would match '\0' in the first position.

wkliao commented 2 years ago

The name buffer is overwritten in line 552. https://github.com/Parallel-NetCDF/PnetCDF/blob/6765111f11f821daf2358e3cee68cce79776a9fd/src/drivers/ncmpio/ncmpio_header_get.c#L552

Can you dump the header of the file? Also, try ncvaidator with valgrind and see if the same warning appears.

cponder commented 2 years ago

I'm not sure what file is being read here. Can you give me a print-statement and line to insert it to?

wkliao commented 2 years ago

The subroutines shown in valgrind message are called when reading a file. If you don't know what file is being read, then you can add a printf statement at the top of ncmpio_open() in file src/drivers/ncmpio/ncmpio_open.c

wkliao commented 2 years ago

I made some changes to avoid using strlen as much as possible. Could you give the master branch a try and see if valgrind still complains?

cponder commented 2 years ago

I can't figure out how to do the got clone inside my container-build. If you could assign a tag to the current snapshot I'd be able to download the tarball.

wkliao commented 2 years ago

Have you tried the following in your Dockerfile?

RUN git clone https://github.com/Parallel-NetCDF/PnetCDF.git && \
    cd PnetCDF && \
    autoreconf -i && \
    ./configure --prefix=/install/path && \
    make -j8 install

If that does not work, I can either send you a tar ball or create a tag for you.