Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License

make check fails for netCDF-C 4.9.2 (Docker) build #2919

Closed sr-murthy closed 1 month ago

sr-murthy commented 5 months ago

I'm trying to build a Docker image (Ubuntu 18.04) with netCDF-C 4.9.2 from source - I'm following the installation guide for building it with HDF4 support. In the Dockerfile here are the chained instructions for the netCDF build:

...
# Build netCDF C lib.
ARG NETCDF_C_VER=4.9.2
RUN wget https://github.com/Unidata/netcdf-c/archive/refs/tags/v${NETCDF_C_VER}.tar.gz && \
    tar zxvf v${NETCDF_C_VER}.tar.gz && \
    cd netcdf-c-${NETCDF_C_VER} && \
    CPPFLAGS="-Ihdf5-${HDF5_VER}/include -Ihdf-${HDF4_VER}-2/include" \
    LDFLAGS="-Lhdf5-${HDF5_VER}/lib -Lhdf-${HDF4_VER}-2/lib" && \
    ./configure --prefix=/usr/local --disable-byterange --enable-shared --enable-hdf4 --enable-hdf4-file-tests && \
    make && make check && make install && \
    cd .. && rm -rf netcdf-c-${NETCDF_C_VER} && rm -f v${NETCDF_C_VER}.tar.gz
...

The error occurs during make check:

113.0 Making check in include
113.0 make[1]: Entering directory '/netcdf-c-4.9.2/include'
113.0 make[1]: Nothing to be done for 'check'.
113.0 make[1]: Leaving directory '/netcdf-c-4.9.2/include'
113.0 Making check in libdispatch
113.0 make[1]: Entering directory '/netcdf-c-4.9.2/libdispatch'
113.0 make[1]: Nothing to be done for 'check'.
113.0 make[1]: Leaving directory '/netcdf-c-4.9.2/libdispatch'
113.0 Making check in libsrc
113.0 make[1]: Entering directory '/netcdf-c-4.9.2/libsrc'
113.0 make[1]: Nothing to be done for 'check'.
113.0 make[1]: Leaving directory '/netcdf-c-4.9.2/libsrc'
113.0 Making check in libsrc4
113.1 make[1]: Entering directory '/netcdf-c-4.9.2/libsrc4'
113.1 make[1]: Nothing to be done for 'check'.
113.1 make[1]: Leaving directory '/netcdf-c-4.9.2/libsrc4'
113.1 Making check in libhdf4
113.1 make[1]: Entering directory '/netcdf-c-4.9.2/libhdf4'
113.1 make[1]: Nothing to be done for 'check'.
113.1 make[1]: Leaving directory '/netcdf-c-4.9.2/libhdf4'
113.1 Making check in libhdf5
113.1 make[1]: Entering directory '/netcdf-c-4.9.2/libhdf5'
113.1 make[1]: Nothing to be done for 'check'.
113.1 make[1]: Leaving directory '/netcdf-c-4.9.2/libhdf5'
113.1 Making check in libncpoco
113.1 make[1]: Entering directory '/netcdf-c-4.9.2/libncpoco'
113.1 make[1]: Nothing to be done for 'check'.
113.1 make[1]: Leaving directory '/netcdf-c-4.9.2/libncpoco'
113.1 Making check in libnczarr
113.2 make[1]: Entering directory '/netcdf-c-4.9.2/libnczarr'
113.2 make[1]: Nothing to be done for 'check'.
113.2 make[1]: Leaving directory '/netcdf-c-4.9.2/libnczarr'
113.2 Making check in liblib
113.2 make[1]: Entering directory '/netcdf-c-4.9.2/liblib'
113.2 make[1]: Nothing to be done for 'check'.
113.2 make[1]: Leaving directory '/netcdf-c-4.9.2/liblib'
113.2 Making check in ncgen3
113.2 make[1]: Entering directory '/netcdf-c-4.9.2/ncgen3'
113.2 make  check-TESTS
113.2 make[2]: Entering directory '/netcdf-c-4.9.2/ncgen3'
113.2 make[3]: Entering directory '/netcdf-c-4.9.2/ncgen3'
113.3 PASS: run_tests.sh
113.3 FAIL: run_nc4_tests.sh
113.4 ============================================================================
113.4 Testsuite summary for netCDF 4.9.2
113.4 ============================================================================
113.4 # TOTAL: 2
113.4 # PASS:  1
113.4 # SKIP:  0
113.4 # XFAIL: 0
113.4 # FAIL:  1
113.4 # XPASS: 0
113.4 # ERROR: 0
113.4 ============================================================================
113.4 See ncgen3/test-suite.log
113.4 Please report to support-netcdf@unidata.ucar.edu
113.4 ============================================================================
113.4 Makefile:938: recipe for target 'test-suite.log' failed
113.4 make[3]: *** [test-suite.log] Error 1
113.4 make[2]: *** [check-TESTS] Error 2
113.4 make[3]: Leaving directory '/netcdf-c-4.9.2/ncgen3'
113.4 Makefile:1044: recipe for target 'check-TESTS' failed
113.4 make[2]: Leaving directory '/netcdf-c-4.9.2/ncgen3'
113.4 Makefile:1126: recipe for target 'check-am' failed
113.4 make[1]: Leaving directory '/netcdf-c-4.9.2/ncgen3'
113.4 make[1]: *** [check-am] Error 2
113.4 make: *** [check-recursive] Error 1
113.4 Makefile:769: recipe for target 'check-recursive' failed

Earlier instructions build HDF4 (4.2.16) and HDF5 (1.14.4-2) from source, successfully. Here are the chained instructions for those:

#Build HDF4 lib.
ARG HDF4_VER=4.2.16
RUN wget https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF4/HDF${HDF4_VER}-2/src/hdf-${HDF4_VER}-2.tar.gz && \
    tar zxvf hdf-${HDF4_VER}-2.tar.gz && \
    cd hdf-${HDF4_VER}-2 && \
    ./configure --prefix=/usr/local/ --enable-shared --disable-netcdf --disable-fortran && \
    make && make check && make install && \
    cd .. && \
    rm -f hdf-${HDF4_VER}-2.tar.gz
...
# Build HDF5 lib.
ARG HDF5_VER=1.14.4-2
ARG HDF5_DOTVER=1.14.4.2
RUN wget https://github.com/HDFGroup/hdf5/releases/download/hdf5_${HDF5_DOTVER}/hdf5-${HDF5_VER}.tar.gz && \
    tar zxvf hdf5-${HDF5_VER}.tar.gz && \
    cd hdf5-${HDF5_VER} && \
    ./configure --prefix=/usr/local/ && \
    make && make check && make install && \
    cd .. && \
    rm -f hdf5-${HDF5_VER}.tar.gz

My guess is there is some inconsistency in the way in which configure and make have been run for HDF4 vs netCDF, but I'm not sure.

P.S. Ubuntu 18.04 is of course deprecated, but I need to build an image for it.
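On the guess above about how configure was run: one shell detail in the netCDF step may be worth checking. The step writes `CPPFLAGS="..." LDFLAGS="..." && ./configure ...`. In a POSIX shell, an assignment followed by `&&` sets a shell variable but does not export it, so a child process such as `./configure` only sees it if it was already exported. Whether this matters here is an assumption, not a diagnosis: HDF4 and HDF5 were installed under /usr/local, which compilers typically search by default. The sketch below uses a hypothetical variable, `DEMO_FLAGS`, to show the difference between the two forms.

```shell
#!/bin/sh
# DEMO_FLAGS is a hypothetical stand-in for CPPFLAGS/LDFLAGS.
unset DEMO_FLAGS

# Form used in the Dockerfile: assignment, then '&&', then the command.
# The variable is set in the shell but is not exported to the child process.
with_and=$(DEMO_FLAGS="-I/opt/demo/include" && sh -c 'echo "${DEMO_FLAGS:-unset}"')

# Prefix form: the assignment is part of the command itself,
# so the variable is placed in the child's environment.
with_prefix=$(DEMO_FLAGS="-I/opt/demo/include" sh -c 'echo "${DEMO_FLAGS:-unset}"')

echo "assignment && command : $with_and"
echo "assignment command    : $with_prefix"
```

If the flags are meant to reach configure, dropping the `&&` between the assignments and `./configure` (or exporting the variables first) would pass them through. The include and library paths in the original step are also relative, so they would resolve against the netcdf-c source directory.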

WardF commented 4 months ago

Do you happen to have the config.log and test-suite.log files generated? That would shed a lot of light on what's happening under the hood.

sr-murthy commented 4 months ago

Those would be inside the Docker container, but I could extract them and get back to you.

sr-murthy commented 4 months ago

@WardF Here are the files extracted from the (Ubuntu 18.04) container - I just ran the build steps inside the container and reproduced the error.

test-suite.log config.log

P. S. The Dockerfile I'm using is here. Along with HDF4 and HDF5, I'm also trying to build HDF-EOS2 and H4CF from source. So the (attempted) build order inside the container is:

  1. HDF4
  2. HDF-EOS2
  3. HDF5
  4. netCDF-C
  5. H4CF

Steps 1-3 succeed - it's step #4 (netCDF-C) that fails.

WardF commented 4 months ago

Thanks, I'll take a look at this. Do you happen to have the apt.txt file referenced by the Dockerfile you linked to? Also, is there any chance you have the test-suite.log file for the failing tests? The one you shared shows no errors.

sr-murthy commented 4 months ago

@WardF The apt file is here.

Yes, I can see that the test-suite.log didn't contain any errors, which is why I raised the issue - the chained build step clearly failed on make check. Would there be a separate log file for make check, and where would I look for it?
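On where to look: the Automake test harness writes one test-suite.log per tested directory (the build output above points at ncgen3/test-suite.log), plus a `.log` and a `.trs` file next to each test script. A minimal sketch of a helper that prints the logs of failed tests follows. The function name `dump_failed_test_logs` is hypothetical, and it assumes the `.trs` files record results as `:test-result: FAIL` lines, as the Automake parallel harness does.

```shell
#!/bin/sh
# dump_failed_test_logs DIR
# Hypothetical helper: print the per-test log of every failed Automake test
# found under DIR (default: current directory).
dump_failed_test_logs() {
    # List .trs result files that record a failing outcome.
    find "${1:-.}" -type f -name '*.trs' \
        -exec grep -lE '^:test-result: (FAIL|ERROR|XPASS)' {} + | sort |
    while IFS= read -r trs; do
        # Each foo.trs has a sibling foo.log with the test's output.
        log="${trs%.trs}.log"
        printf '===== %s =====\n' "$log"
        if [ -f "$log" ]; then
            cat "$log"
        fi
    done
}
```

Inside a Dockerfile RUN step, assuming the function has been defined in that same shell, something like `make check || { dump_failed_test_logs .; exit 1; }` would print the failing logs to the build output. That is useful because the files from a failed build step are not kept in a final image.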

sr-murthy commented 4 months ago

I will try it again, and see if I can find anything else that can pinpoint the make check errors.

sr-murthy commented 4 months ago

@WardF I believe I am running configure correctly, but do you see any issues with that?

WardF commented 1 month ago

It looks like you are running configure correctly. My apologies for the delay, and my thanks for your patience. I am circling back around to this.

sr-murthy commented 1 month ago

@WardF Thanks, not a problem. I did try and find other files in the container that might indicate the cause of the problem, but no luck I'm afraid.

It's not urgent.

WardF commented 1 month ago

I'm at a bit of a loss. If I build the container from a modified Dockerfile which doesn't build netcdf, and then run the container and build netCDF using the same arguments you are using in your script, it builds fine and the tests all pass. When it is automated, they do not, for reasons I'll have to keep investigating. I also don't observe this behavior in any of the Docker images we use for regression tests. I'll keep poking at this, but no obvious answer outside of "something in Docker" is immediately suggesting itself.

sr-murthy commented 1 month ago

Thanks @WardF. I might look at this again if I have time, but I'm a little doubtful now about any possible resolution.

I'll keep this open for the moment, but at some point I suppose I may have to close it.

aafaque33 commented 1 month ago

@sr-murthy I was having the same build issue in Docker and realized it was because the ftp site was not reachable (this happens when the build tries to download the HDF4 files, because of --enable-hdf4-file-tests in your Dockerfile). Sharing this in case your issue is the same: https://github.com/Unidata/netcdf-c/issues/2951

Note: that issue mentions netcdf v4.7.x, but I had the same problem building 4.9.2.

If that is the cause, you need the following in your Dockerfile (after configure and before make check):

# Workaround: fetch the sample files over https instead of ftp until an official netcdf-c release includes the fix.
# The ftp site no longer works.
&& sed -i 's~ftp://ftp.unidata.ucar.edu/pub/netcdf/sample_data/hdf4/$1.gz~https://resources.unidata.ucar.edu/netcdf/sample_data/hdf4/$1.gz~g' hdf4_test/run_get_hdf4_files.sh \
&& sed -i 's~FTPFILE~DATAFILE~g' hdf4_test/run_get_hdf4_files.sh

sr-murthy commented 1 month ago

@aafaque33 I can't see an FTP related error in the original build failure - from the logs it appears to be an ncgen3-related test failure.

WardF commented 1 month ago

Agreed. While the change in the ftp site was responsible for some errors, this one is reproducible on my end, but only when building the image. If I modify the Dockerfile to install everything before netCDF, and then run the container and build netCDF manually, the test failures do not occur (at least, in my local environment).

WardF commented 1 month ago

Files:

Summary:

I'm leaning towards closing this out because it does not appear that the issue is with netCDF itself, but something in the docker environment.

I've created a modified Dockerfile, Dockerfile.slim, and a companion script file, script_build.sh. All of the provisioning has been offloaded into the bash script; when I build an image from this Dockerfile, connect to it, and invoke script_build.sh, everything completes successfully except h4cflib. Right now the error message is Can't link against gctp library in hdfeos2 library, but that happens after the successful build and test of libnetcdf, so I will leave it to you to investigate.

I have no idea why this provisioning fails when it is baked into the Dockerfile, but I feel that this demonstrates it is not something with netCDF.

I've attached the files here; I'd be curious to hear what happens on your end @sr-murthy. My process was pretty straightforward:

$ docker build -t tmp -f Dockerfile.slim .
$ docker run --rm -it tmp bash
(inside the running container) $ ./script_build.sh

I hope this helps! I'm going to close out this issue, but I'll gladly reopen if it so transpires that there is still a demonstrable issue with netCDF.

sr-murthy commented 1 month ago

Thanks @WardF. It seems possible from what you're saying that the issue is related not to netCDF but to Docker. I will look at this again to see if I've missed anything. In any case, I will also try out your Dockerfile build, with an updated version of Docker.

I would mention that the recent version of my Dockerfile, which led to this issue, is more or less identical to the version from 5 years ago, when everything worked. Obviously, since then we have had newer versions of all of the libraries I'm attempting to build, as well as newer versions of Docker, but I didn't expect this to change the build result.