easybuilders / easybuild-easyconfigs

A collection of easyconfig files that describe which software to build using which build options with EasyBuild.
https://easybuild.io
GNU General Public License v2.0

HDF4 contradictions #12306

Open justbennet opened 3 years ago

justbennet commented 3 years ago

I was updating HDF 4.2.15 to build with gcc/gfortran 10.2.0 and working through the issues. GCC 10 requires the -fallow-argument-mismatch flag to downgrade this error

Error: Type mismatch between actual argument at (1) and actual argument at (2) (CHARACTER(0)/INTEGER(4)).

to a warning.
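For a build outside EB, the flag can be added conditionally, since -fallow-argument-mismatch only exists in GCC 10 and later and older gfortran releases reject it as an unknown option. A minimal sketch follows. The function name is made up for illustration, and passing the flag through FFLAGS to HDF's configure is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the extra Fortran flags needed for a given
# gfortran major version. -fallow-argument-mismatch was introduced in
# GCC 10, so nothing is printed for older compilers.
extra_fflags_for_gfortran() {
    major="$1"
    if [ "$major" -ge 10 ]; then
        printf '%s\n' "-fallow-argument-mismatch"
    else
        printf '\n'
    fi
}

# Possible usage for a manual build (assumes configure honours FFLAGS):
#   major=$(gfortran -dumpversion | cut -d. -f1)
#   FFLAGS="$(extra_fflags_for_gfortran "$major")" ./configure --prefix=/tmp/local/hdf
```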

After installing the now updated 4.2.15-GCCcore-10.2.0, I tried compiling the test code provided with the HDF installation. C sources build and run. Fortran sources, however, fail with missing references.

[bennetsw@gl-build fortran]$ ./run-fortran-ex.sh 

#################  VD_create_vdatas  #################
VD_create_vdatas.o:VD_create_vdatas.f:function main: error: undefined reference to 'hopen_'
. . . . 
VD_create_vdatas.o:VD_create_vdatas.f:function main: error: undefined reference to 'hclose_'
collect2: error: ld returned 1 exit status
messed up compiling VD_create_vdatas.f

I then installed HDF outside of EB, without the --enable-shared --disable-fortran --disable-netcdf options that appear in the second iteration of configopts, and with that build the Fortran sources compile and run fine.

The missing references are in libdf.a. Looking in the hand-installed tree, I find

$ strings /tmp/local/hdf/lib/libdf.a | egrep 'hopen_|hclose'
hclose_
hopen_
hclose_
hopen_

whereas in the EB-installed tree,

$ strings /tmp/local/hdf/lib/libdf.a | egrep 'hopen_|hclose'
$
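A side note on the check itself: strings matches any occurrence of the name, including undefined references from objects that merely call the routine, so nm may give a firmer answer. The sketch below filters nm output for symbols that are actually defined in the text section. The function name is hypothetical and the archive path in the usage comment is only an example.

```shell
#!/bin/sh
# Hypothetical helper: read `nm` output on stdin and succeed only if the
# given symbol is defined in a text section (type T or t). Undefined
# references show up as "U name" with no address, i.e. two fields.
defines_symbol() {
    awk -v sym="$1" '
        NF == 3 && ($2 == "T" || $2 == "t") && $3 == sym { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Possible usage:
#   nm /tmp/local/hdf/lib/libdf.a | defines_symbol hopen_ && echo "Fortran stubs present"
```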

The second iteration of configure/make install overwrites the results of the first. This was discussed in PR #11847, where it seems to have been assumed that the second configure/make install would only add the shared libraries. Instead, it appears to overwrite the static libraries as well, so the static and shared libraries end up with the same capabilities.

So @smoors was right to suggest a suffix-tagged version for GDAL, though not because they will pick up the shared libraries. The reason is that the static libraries also have Fortran disabled, so there isn't an HDF usable with Fortran.

Looks like the most recent discussion was only in December 2020, so there has not been much time to discover that Fortran does not work.

Naming the suffixed version might be problematic because two things are being disabled and one enabled, and it is not clear which would be most important. Enabling shared libraries disables Fortran, so that could be encapsulated in a single suffix (-shared_libs or -nofortran), but GDAL also requires --disable-netcdf, and that might introduce a third configuration.

I don't know what else uses the older HDF4 standard. I only know for sure that programs using satellite data do. If that effectively confines use to geospatial, then creating one set for HDF with Fortran and without NetCDF, and one with shared libraries and no Fortran would be adequate.

Suggestions for a) how many configurations and b) how to name them?

bartoldeman commented 6 months ago

HDF4 got a bit of love this year to make it maintainable: https://www.hdfgroup.org/2023/03/release-of-hdf-4-2-16-newsletter-191/ includes a video that explains some of the issues.

Basically, the Fortran interface is broken and will be removed: it does not work properly on 64-bit systems because it mixes 32-bit integers with pointers. It is already gone (by default) in 4.2.16.

As for netCDF, this is a copy of 2.3.2 from 1993, crammed inside. If you use --disable-netcdf it'll keep the netcdf symbols but prefix them with sd_ so e.g. ncclose becomes sd_ncclose. It'll also skip installing the conflicting netcdf.h header file.
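To see which variant a given install ended up with, one could look at how ncclose is exported. This is only a sketch: the function name is hypothetical, and it assumes the embedded netCDF API lives in libmfhdf.a under the install prefix.

```shell
#!/bin/sh
# Hypothetical helper: read `nm` output on stdin and report how the
# embedded netCDF 2.x entry point ncclose is exported.
#   prefixed -> sd_ncclose is defined (what --disable-netcdf produces)
#   plain    -> ncclose is defined (clashes with modern netCDF)
#   absent   -> neither symbol is defined in this archive
netcdf_symbol_style() {
    awk '
        NF == 3 && $2 == "T" && $3 == "sd_ncclose" { prefixed = 1 }
        NF == 3 && $2 == "T" && $3 == "ncclose"    { plain = 1 }
        END {
            if (prefixed)   print "prefixed"
            else if (plain) print "plain"
            else            print "absent"
        }
    '
}

# Possible usage (archive name is an assumption):
#   nm "$EBROOTHDF/lib/libmfhdf.a" | netcdf_symbol_style
```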

If you also use --disable-netcdf-tools, it will not compile ncdump and ncgen, which are binaries also present in modern netCDF.

Now, for us, all the main HDF4 users (NCL, NCO, and GDAL) also use modern netCDF, so having the old embedded netCDF is problematic, and we use --disable-netcdf --disable-netcdf-tools for both static and shared libraries.

In the video presentation the presenter claimed that "somebody at NASA is going to yell at him" for removing ncdump and ncgen from HDF4, but I don't think that applies to our use case.

Also tagging @smoors @boegel and @verdurin FYI.