ComputationalRadiationPhysics / libSplash

libSplash - Simple Parallel file output Library for Accumulating Simulation data using Hdf5
GNU Lesser General Public License v3.0

Close #184 Compression Header as Bool #199

Closed ax3l closed 9 years ago

ax3l commented 9 years ago

Description

Closes #184 by using the new h5py-compatible bool type for our /header/compression attribute.

Increases the format version by a minor step. It can be merged directly after #198: the attribute is purely informative and has no influence on transparent reads of compressed or uncompressed data sets, and the format has already had a major increase since the last stable release.

Also changes the type used internally for SDC_ATTR_MAX_ID from the native type to our ColTypeInt32. This does not change the format, since the two are identical (or at least H5T_NATIVE_INT32 (ANSI C9x) and H5T_INTEL_I32 seem to be identical).
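The two integer types can only coincide on a little-endian host, because H5T_INTEL_I32 is a little-endian 32-bit integer while the native type follows the machine. A rough sketch of that byte-level comparison, using only Python's standard `struct` module, is given here. The function name is hypothetical, and the snippet illustrates the byte layout only; it does not call HDF5's own type comparison.

```python
import struct
import sys


def native_matches_little_endian_int32(value):
    """Compare the native and little-endian byte layouts of a 32-bit int.

    Hypothetical helper for illustration; not part of libSplash or HDF5.
    "=i" packs with native byte order and standard size (4 bytes),
    "<i" packs as little-endian with standard size (4 bytes).
    """
    native = struct.pack("=i", value)
    little = struct.pack("<i", value)
    return native == little


# Use a value whose bytes differ between byte orders (0 would match everywhere).
print(sys.byteorder, native_matches_little_endian_int32(1))
```

On a big-endian machine the comparison would be expected to fail, so the "identical" observation should be read as host-dependent.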

Tests

h5diff output:

$ h5diff -c h5/testWriteRead_0_0_0.h5 ../build/h5/testWriteRead_0_0_0.h5
[...]
Not comparable: <compression> is of class H5T_INTEGER and <compression> is of class H5T_ENUM

attribute: <splashFormat of </header>> and <splashFormat of </header>>
1 differences found

h5dump -A h5/testWriteRead_0_0_0.h5: old:

[...]
   GROUP "header" {
      ATTRIBUTE "compression" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SCALAR
         DATA {
         (0): 0
         }
      }
[...]
   }
[...]

new (all-CAPS labels depend on #198):

[...]
   GROUP "header" {
      ATTRIBUTE "compression" {
         DATATYPE  H5T_ENUM {
            H5T_STD_I8LE;
            "TRUE"             1;
            "FALSE"            0;
         }
         DATASPACE  SCALAR
         DATA {
         (0): false
         }
      }
[...]
   }
[...]

Python via

python -c 'import h5py as h5; f=h5.File("h5/testWriteRead_0_0_0.h5"); c=f["header"].attrs["compression"]; print( c, type(c) )'

uncompressed: new (False, <type 'numpy.bool_'>) and old (0, <type 'numpy.int32'>)

compressed [1]: new (True, <type 'numpy.bool_'>) and old (1, <type 'numpy.int32'>; the value can be arbitrary)
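Scripts that have to read both old and new files could normalize the attribute before using it. A minimal sketch follows; the helper name `is_compressed` is hypothetical and not part of libSplash or h5py. It assumes that old files store an integer where 0 means uncompressed and any non-zero value means compressed, and that new files store a boolean.

```python
def is_compressed(attr_value):
    """Normalize the /header/compression attribute to a plain Python bool.

    Hypothetical helper, not part of libSplash or h5py.
    Assumed old format: integer, 0 = uncompressed, non-zero = compressed.
    Assumed new format: boolean (numpy.bool_ when read through h5py).
    """
    return bool(attr_value)


# Plain Python values stand in here for what h5py would return.
print(is_compressed(0), is_compressed(1), is_compressed(7))
print(is_compressed(False), is_compressed(True))
```

In practice the argument would be the value read with f["header"].attrs["compression"], as in the one-liner above.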

Notes

[1] compressed output in tests used patch:

diff --git a/tests/SimpleDataTest.cpp b/tests/SimpleDataTest.cpp
index 9907663..614282a 100644
--- a/tests/SimpleDataTest.cpp
+++ b/tests/SimpleDataTest.cpp
@@ -72,6 +72,7 @@ bool SimpleDataTest::subtestWriteRead(Dimensions gridSize, Dimensions borderSize
     // write data to file
     DataCollector::FileCreationAttr fileCAttr;
     DataCollector::initFileCreationAttr(fileCAttr);
+    fileCAttr.enableCompression = true;
     dataCollector->open(HDF5_FILE, fileCAttr);

     // initial part of the test: data is written to the file, once with and once

The bool_ representation depends on #198.

f-schmitt commented 9 years ago

@ax3l Do we need to rerun those checks or does this happen automatically?

ax3l commented 9 years ago

No, I already checked them manually before opening the PR :) but I will rebase so we are sure.

ax3l commented 9 years ago

@f-schmitt-zih rebased, updated and tested