LLNL / H5Z-ZFP

A registered ZFP compression plugin for HDF5

Help using H5Z-ZFP with netCDF #11

Closed mathomp4 closed 5 years ago

mathomp4 commented 5 years ago

All,

This might be something never tried before, but I'm hoping I can get help here. With @lindstro's help, I was able to compile both zfp 0.5.2 and H5Z-ZFP and I think I did so correctly. I then, echoing the example noted on this page:

https://www.unidata.ucar.edu/software/netcdf/docs/md__Users_wfisher_Desktop_v4_86_82_netcdf-c_docs_filters.html#NCCOPY

tried to do an nccopy using zfp, but:

(133) $ setenv HDF5_PLUGIN_PATH $SITEAM/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/H5Z-ZFP/plugin
(134) $ $SITEAM/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/bin/nccopy -F 'T,32013,6,3,0,3539053052,1062232653,0,0' stock-JU-2018Sep27-1day-c12.geosgcm_prog.20000415_0000z.nc4 test.nc4
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5E.c line 607 in H5Eget_class_name(): not a error class ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5T.c line 1876 in H5Tget_class(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
NetCDF: HDF error
Location: file nccopy.c; line 1886
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5Z.c line 366 in H5Zunregister(): unable to unregister filter
    major: Data filters
    minor: Unable to initialize object
  #001: H5Z.c line 401 in H5Z__unregister(): filter is not registered
    major: Data filters
    minor: Object not found

So obviously I don't know what I'm doing.

I hope you don't mind helping me out here. It's entirely possible I didn't even get the builds correct.
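Before digging into the filter itself, it can help to confirm that the directory named in `HDF5_PLUGIN_PATH` actually contains a loadable library. The following is a minimal Python sketch, not part of H5Z-ZFP or netCDF; the function name `find_plugin_libs` and the list of file suffixes are assumptions made for illustration only.

```python
import os
from pathlib import Path

# File suffixes treated as loadable plugin libraries (an assumption for this sketch).
PLUGIN_SUFFIXES = (".so", ".dylib", ".dll")


def find_plugin_libs(plugin_path):
    """Return shared-library files found in each directory of a plugin path.

    plugin_path is a string in the same form as HDF5_PLUGIN_PATH:
    one or more directories joined by the platform path separator.
    """
    found = []
    for entry in plugin_path.split(os.pathsep):
        directory = Path(entry)
        if not entry or not directory.is_dir():
            continue  # skip empty entries and missing directories
        for candidate in sorted(directory.iterdir()):
            if candidate.is_file() and candidate.suffix in PLUGIN_SUFFIXES:
                found.append(candidate)
    return found


if __name__ == "__main__":
    value = os.environ.get("HDF5_PLUGIN_PATH", "")
    libs = find_plugin_libs(value)
    if libs:
        for lib in libs:
            print(lib)
    else:
        print("no plugin libraries found in HDF5_PLUGIN_PATH")
```

An empty result would suggest a build or install problem rather than a filter bug; a non-empty result only shows that a library file is present, not that HDF5 can load it.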

markcmiller86 commented 5 years ago

I'm in a hack-a-thon right now. I think I know what is happening. This is related to error handling setup logic in the filter. Nothing you are doing wrong. Also, I haven't used 1.10.4 yet, so it could be related to an incompatibility there. Will look at it later tomorrow. Pls ping me by Monday if you don't hear anything.

mathomp4 commented 5 years ago

@markcmiller86 Okay. Though, I did try the make check in the package and it wasn't happy either, which might mean I have problems elsewhere:

/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/bin/h5pcc -c test_write.c -o test_write_plugin.o -DH5Z_ZFP_USE_PLUGIN -fPIC -I../src -I/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/zfp/include -I/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/include/hdf5
/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/bin/h5pcc test_write_plugin.o -o test_write_plugin -Wl,-rpath,/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/lib -Wl,-rpath,/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/zfp/lib64 -L/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/lib -L/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/zfp/lib64 -lhdf5 -lzfp -lm
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5E.c line 607 in H5Eget_class_name(): not a error class ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5S.c line 1013 in H5Sget_simple_extent_dims(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5Dint.c line 326 in H5D__create_named(): unable to create and link to dataset
    major: Dataset
    minor: Unable to initialize object
  #002: H5L.c line 1572 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #003: H5L.c line 1813 in H5L__create_real(): can't insert link
    major: Links
    minor: Unable to insert object
  #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #006: H5L.c line 1619 in H5L__link_cb(): unable to create object
    major: Links
    minor: Unable to initialize object
  #007: H5Oint.c line 2645 in H5O_obj_create(): unable to open object
    major: Object header
    minor: Can't open object
  #008: H5Doh.c line 300 in H5O__dset_create(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #009: H5Dint.c line 1026 in H5D__create(): I/O filters can't operate on this dataset
    major: Invalid arguments to routine
    minor: Unable to initialize object
  #010: H5Z.c line 894 in H5Z_can_apply(): unable to apply filter
    major: Data filters
    minor: Error from filter 'can apply' callback
  #011: H5Z.c line 854 in H5Z_prepare_prelude_callback_dcpl(): unable to apply filter
    major: Data filters
    minor: Error from filter 'can apply' callback
  #012: H5Z.c line 754 in H5Z_prelude_callback(): error during user callback
    major: Data filters
    minor: Error from filter 'can apply' callback
H5Dcreate failed at line 393, errno=2 (No such file or directory)
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5Z.c line 366 in H5Zunregister(): unable to unregister filter
    major: Data filters
    minor: Unable to initialize object
  #001: H5Z.c line 401 in H5Z__unregister(): filter is not registered
    major: Data filters
    minor: Object not found
/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/x86_64-unknown-linux-gnu/ifort_18.0.3.222-openmpi_3.1.0-gcc_6.3.0/Linux/bin/h5dump: error while loading shared libraries: libmpi.so.40: cannot open shared object file: No such file or directory
ZFP rate test failed for rate=32
make[1]: *** [test-rate] Error 1
make[1]: Leaving directory `/ford1/share/gmao_SIteam/Baselibs/ESMA-Baselibs-5.2.0-ZFPTry/src/H5Z-ZFP/test'
make: *** [check] Error 2

markcmiller86 commented 5 years ago

I think it is the same problem. And I haven't worked with HDF5 1.10.4 yet, so that could be part of it.

mathomp4 commented 5 years ago

@markcmiller86 Good to know!

I am a bugfinder. I seem to have found one between netCDF and NCO as well with netCDF 4.6.2, so I seem to be in fine fettle at present. :)

markcmiller86 commented 5 years ago

@mathomp4, I've tried HDF5-1.10.4 on my OSX system with clang and GNU compilers. It all works without the issues you are encountering. So, I am thinking this has to do with the Intel compilers and/or Open MPI.

Can you please provide the command you used to configure HDF5 and also include compiler version information.

Also, looking closer at your nccopy command, how did you arrive at the argument 'T,32013,6,3,0,3539053052,1062232653,0,0'? These appear to be control inputs for the H5Z-ZFP filter. However, as described in the last paragraph of the H5Z-ZFP generic interface docs, you cannot use the values gleaned from h5dump or h5ls to set up the generic cd_values controls passed to HDF5.
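To make the two large integers in that argument less opaque, here is a minimal Python sketch (not from H5Z-ZFP) showing how a 64-bit double can be split into, or rebuilt from, two unsigned 32-bit words. It assumes little-endian order with the low word first; whether that matches the layout the filter expects for a given mode is an assumption here, and the helper names are hypothetical.

```python
import struct


def words_to_double(low, high):
    """Reassemble a double from two unsigned 32-bit words (low word first)."""
    return struct.unpack("<d", struct.pack("<II", low, high))[0]


def double_to_words(value):
    """Split a double into (low, high) unsigned 32-bit words."""
    low, high = struct.unpack("<II", struct.pack("<d", value))
    return low, high


if __name__ == "__main__":
    # The two large integers from the nccopy -F argument in this thread.
    print(words_to_double(3539053052, 1062232653))
    # The reverse direction for the same value.
    print(double_to_words(0.001))
```

Under that assumption the pair 3539053052, 1062232653 decodes to 0.001. This only illustrates the bit-level packing; it does not say which values are valid for the generic interface.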

mathomp4 commented 5 years ago

@markcmiller86 The compiler would have been GCC 6.3.0 for C, Intel Fortran 18.0.3, and Open MPI 3.1.0 on a CentOS 7 desktop.

As for configure, hmm, I'll try and get to my work desktop and see (contractor for NASA, so whether or not systems are up and running is a tossup!). It's probably a bit weird as our Base library collection has been around a while.

That said, I am working on a MacBook now, so I am going to try building it here as well. Just need to get the changes I made for ZFP and H5Z-ZFP. Maybe I got those wrong too!

mathomp4 commented 5 years ago

Okay. Here we go. I just tried building zfp and H5Z-ZFP on my macbook. First, how I built things:

hdf5:

$ ./configure --prefix=/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin \
--includedir=/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/include/hdf5 \
--with-szlib=/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/include/szlib,/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/lib \
--with-zlib=/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/include/zlib,/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/lib \
--disable-shared --disable-cxx --enable-hl --enable-fortran \
--disable-sharedlib-rpath --enable-parallel --enable-fortran2003 \
CFLAGS= FCFLAGS= CC=mpicc FC=mpifort CXX=mpic++ F77=mpifort

zfp:

cmake -DCMAKE_INSTALL_PREFIX=$(prefix)/zfp \
                -DZFP_BIT_STREAM_WORD_SIZE=8 -DBUILD_SHARED_LIBS=OFF .. 

H5Z-ZFP:

      $(MAKE) \
          CC=$(H5_CC) FC=$(H5_FC) \
          PREFIX=$(prefix)/H5Z-ZFP \
          HDF5_HOME=$(prefix) \
          ZFP_HOME=$(prefix)/zfp all; \
      $(MAKE) \
          CC=$(H5_CC) FC=$(H5_FC) \
          PREFIX=$(prefix)/H5Z-ZFP \
          HDF5_HOME=$(prefix) \
          ZFP_HOME=$(prefix)/zfp install

mathomp4 commented 5 years ago

zfp does not pass its ctest because of the word size I think:

test 1
    Start 1: small-arrays-1d-fp32

1: Test command: /Users/mathomp4/Baselibs/ESMA-Baselibs-5.2.1-ZFP/src/zfp/build/bin/testzfp "small" "1d" "fp32"
1: Test timeout computed to be: 1500
1: zfp version 0.5.2 (September 28, 2017)
1: library version 82
1: CODEC version 5
1: data model LP64
1:
1: regression testing requires BIT_STREAM_WORD_TYPE=uint64
1/6 Test #1: small-arrays-1d-fp32 .............***Failed    0.02 sec

mathomp4 commented 5 years ago

And when I try to do the make check for H5Z-ZFP:

/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/bin/h5pcc test_write_plugin.o -o test_write_plugin -Wl,-rpath,/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/lib -Wl,-rpath,/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/zfp/lib -L/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/lib -L/Users/mathomp4/installed/MPI/gcc-gfortran-8.2.0/openmpi-3.1.3/Baselibs/5.2.1-ZFP/Darwin/zfp/lib -lhdf5 -lzfp -lm
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5E.c line 607 in H5Eget_class_name(): not a error class ID
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5S.c line 1013 in H5Sget_simple_extent_dims(): not a dataspace
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5D.c line 145 in H5Dcreate2(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5Dint.c line 326 in H5D__create_named(): unable to create and link to dataset
    major: Dataset
    minor: Unable to initialize object
  #002: H5L.c line 1572 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #003: H5L.c line 1813 in H5L__create_real(): can't insert link
    major: Links
    minor: Unable to insert object
  #004: H5Gtraverse.c line 851 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #005: H5Gtraverse.c line 627 in H5G__traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #006: H5L.c line 1619 in H5L__link_cb(): unable to create object
    major: Links
    minor: Unable to initialize object
  #007: H5Oint.c line 2645 in H5O_obj_create(): unable to open object
    major: Object header
    minor: Can't open object
  #008: H5Doh.c line 300 in H5O__dset_create(): unable to create dataset
    major: Dataset
    minor: Unable to initialize object
  #009: H5Dint.c line 1026 in H5D__create(): I/O filters can't operate on this dataset
    major: Invalid arguments to routine
    minor: Unable to initialize object
  #010: H5Z.c line 894 in H5Z_can_apply(): unable to apply filter
    major: Data filters
    minor: Error from filter 'can apply' callback
  #011: H5Z.c line 854 in H5Z_prepare_prelude_callback_dcpl(): unable to apply filter
    major: Data filters
    minor: Error from filter 'can apply' callback
  #012: H5Z.c line 754 in H5Z_prelude_callback(): error during user callback
    major: Data filters
    minor: Error from filter 'can apply' callback
H5Dcreate failed at line 393, errno=2 (No such file or directory)
HDF5-DIAG: Error detected in HDF5 (1.10.4) thread 0:
  #000: H5Z.c line 366 in H5Zunregister(): unable to unregister filter
    major: Data filters
    minor: Unable to initialize object
  #001: H5Z.c line 401 in H5Z__unregister(): filter is not registered
    major: Data filters
    minor: Object not found
h5dump error: unable to get link info from "compressed"
ZFP rate test failed for rate=32

mathomp4 commented 5 years ago

All of these latest tests are GCC 8.2.0 and Open MPI 3.1.3 built by hand on macOS 10.13.6

markcmiller86 commented 5 years ago

I forgot to ask...which version of H5Z-ZFP are you using? Have you tried the current master?

mathomp4 commented 5 years ago

I grabbed 0.9.0 this morning, as it was the last release. I'll give master a test tomorrow.

Also, I am using zfp 0.5.2. Is that still the right version to use?

markcmiller86 commented 5 years ago

Hmm. Those should be fine. I will have to get into an icc machine here and see if I can reproduce.

mathomp4 commented 5 years ago

Well, my tests today had no Intel at all. Just pure GCC (not even clang) and Open MPI. Still get the same error.

markcmiller86 commented 5 years ago

Ok, I think I know what is going on. I am using an HDF5 run-time symbol before it has been initialized, at least in some cases. I am going to get an HDF5 expert to comment on my code. I hope to have a PR later this week.

markcmiller86 commented 5 years ago

@mathomp4...I just merged a PR which I think may fix your issue. Can you please give the newest (master) a try?

markcmiller86 commented 5 years ago

I believe I have corrected this with the previous update.