szcompressor / SZ

Error-bounded Lossy Data Compressor (for floating-point/integer datasets)
http://szcompressor.org

Puzzling error with absolute mode. #108

Closed orioltinto closed 1 year ago

orioltinto commented 1 year ago

Dear developers,

I've been using SZ for a while as an HDF5 filter through Python and the hdf5plugin library.

Recently I was setting up some examples and I found a case in which the compressor is failing.

Specifically, I was setting up some examples using Xarray's tutorial datasets, and for a reason I couldn't figure out, when I use the absolute threshold mode the threshold is not respected in the recovered data, and the magnitude of the error grows along the time (first) axis:

[image: max absolute difference per time step, rising above the tolerance line]

Any clue on what can be happening?

I tried to reproduce this error with other datasets and I didn't manage to do it.

The same plot using SZ3 would be: [image: equivalent plot produced with the SZ3 filter]

The code to produce this plot:

import io

import h5py
import hdf5plugin
import numpy as np
import xarray as xr
from matplotlib import pyplot as plt

def get_xarray_tutorial_data():
    with xr.tutorial.open_dataset("air_temperature") as ds:
        return ds.air.values

TOLERANCE = 2 ** -4

def main():
    # Get encoding for the sz HDF5 filter
    filter_encoding = hdf5plugin.SZ(absolute=TOLERANCE)

    # Get dummy data from xarray tutorial
    data = get_xarray_tutorial_data()

    # open a file in memory
    with io.BytesIO() as bio:
        # Save compressed file
        with h5py.File(bio, mode='w') as temporary_file:
            temporary_file.create_dataset("tmp", data=data, **filter_encoding, chunks=data.shape)
        # Recover data
        with h5py.File(bio, mode='r') as temporary_file:
            recovered_data = temporary_file["tmp"][:]

    # Compute difference
    diff = recovered_data - data
    # Get maximum difference for each time-step
    diff = np.max(np.abs(diff), axis=(1, 2))

    # Plot difference
    plt.plot(diff, label="max diff")

    # Add theoretical threshold for reference
    plt.plot([0, len(diff)], [TOLERANCE, TOLERANCE], ":", label=f"tolerance={TOLERANCE}")

    plt.xlabel("Time")
    plt.ylabel("Max difference")
    plt.yscale("log")
    plt.legend()
    plt.show()
    plt.clf()

if __name__ == "__main__":
    main()

It can run in a container built with:

FROM python:3.11
RUN pip install xarray hdf5plugin h5netcdf pooch matplotlib netCDF4

Thanks in advance.

disheng222 commented 1 year ago

Hi, is it possible that the data at the i-th time step are affected by the compression errors of previous time steps (such as the (i-1)-th)? That is, the plotted issue looks like error propagation. If you think the execution was not affected by compression errors, can you dump one snapshot of the data in binary format and test the compression with SZ's executable to see if the error still happens?
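For instance, a minimal sketch of dumping the array to raw binary so it can be fed to the sz executable (file name illustrative; sz reads raw float32 with the -f option, as in the commands later in this thread):

import numpy as np
import xarray as xr

# Dump the tutorial data as raw little-endian float32 for the sz CLI.
with xr.tutorial.open_dataset("air_temperature") as ds:
    ds.air.values.astype(np.float32).tofile("test.dat")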

Best, Sheng


orioltinto commented 1 year ago

Hi @disheng222, thank you for your answer. I tried to reproduce it with sz's binary and it seems that the error does not happen:

[image: max difference per time step using the sz executable, staying within the tolerance]

So if the executable produces the proper results, might it be a problem with the hdf5 filter?

The hdf5plugin Python library call only produces the corresponding HDF5 filter arguments; in this specific case: {'compression': 32017, 'compression_opts': (0, 1068498944, 0, 0, 0, 0, 0, 0, 0)}
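As a sanity check, 1068498944 is exactly the high 32-bit word of the IEEE 754 double 0.0625, so the absolute bound is being passed through correctly; a quick way to confirm:

import struct

# The error bound travels to the filter as a big-endian double split into
# two 32-bit unsigned ints (high word first, low word second).
high, low = struct.unpack('>II', struct.pack('>d', 0.0625))
print(high, low)  # -> 1068498944 0, matching compression_opts above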

Best regards, Oriol

See code:

import numpy as np
import xarray as xr
from matplotlib import pyplot as plt
from OTils import Executable

# Get an object to call sz from python.
sz = Executable("/scratch/o/Oriol.Tinto/tmp/SZ/install/bin/sz")

def get_xarray_tutorial_data():
    with xr.tutorial.open_dataset("air_temperature") as ds:
        return ds.air.values

def save_data_array(array, file_name):
    # Save the numpy array to a binary file
    with open(file_name, 'wb') as f:
        array.astype('
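The snippet is cut off in the source; presumably the helper just writes the array as raw float32, e.g. (hypothetical completion):

def save_data_array(array, file_name):
    # Hypothetical completion: write the array as raw float32 binary,
    # the format the sz executable reads with the -f option.
    with open(file_name, 'wb') as f:
        f.write(array.astype('float32').tobytes())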
robertu94 commented 1 year ago

All,

This shouldn't matter with respect to obeying the error bound, but these do generate slightly different calls to SZ_compress_args:

the HDF5 filter

dataType = 0
data = 0x7fffb8eca010
outSize = 0x7fffffffb3e8
errBoundMode = 0
absErrBound = 0.0625
relBoundRatio = 0
pwrBoundRatio = 0
r5 = 0
r4 = 0
r3 = 2920
r2 = 25
r1 = 53

the command line

dataType = 0
data = 0x7ffff6bf9010
outSize = 0x7fffffffc4d0
errBoundMode = 0
absErrBound = 0.0625
relBoundRatio = 0.0001
pwrBoundRatio = 0.001
r5 = 0
r4 = 0
r3 = 53
r2 = 25
r1 = 2920

We should be able to ignore the differences in relBoundRatio, pwrBoundRatio, and outSize because these are uninitialized in this case in the CLI. The pointer in data will differ as it depends on malloc. However, the values for r1-r3 are reversed between the two.
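For what it's worth, reversing the dimension list relabels the axes of the same flat buffer rather than corrupting it; a minimal numpy sketch of the equivalence:

import numpy as np

# A C-ordered array of shape (2920, 25, 53) and the same bytes read back
# as a Fortran-ordered array of shape (53, 25, 2920) describe one and the
# same memory layout; only the axis labels are swapped.
a = np.arange(2920 * 25 * 53, dtype=np.float32).reshape(2920, 25, 53)
b = np.frombuffer(a.tobytes(), dtype=np.float32).reshape(53, 25, 2920, order='F')
assert np.array_equal(a, b.transpose(2, 1, 0))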


orioltinto commented 1 year ago

Hi @robertu94, thanks for your answer.

Any clue about why this could be happening? I didn't manage to understand where exactly these values are set in the filter code.

I could see with gdb that the dimensions arrive at the filter code in the proper order; I am not sure where they are reversed and why.

Best regards,

Oriol

disheng222 commented 1 year ago

Hi Robert, do you mean the issue is because of the order of the data dimensions? If the dimensions are messed up, the compression ratio will definitely be 'wrong'. It seems the user is using what's called the air_temperature dataset. Where did you find this dataset for the test on your side? Thanks.

Best, Sheng


robertu94 commented 1 year ago

Sheng,

The Python code provided downloads it. No, the dimensions being inverted is not the issue. I can run the dataset with both LibPressio and the CLI in both orders just fine.

Robert


disheng222 commented 1 year ago

I generated the binary data file and tested the executable command (sz), which looks fine in both cases with different dimension orders, as shown below:

[sdi@localhost Orioltinto]$ sz -f -x -i test.dat -M ABS -A 0.0625 -3 2920 25 52 -s test.dat.sz -a
Min=221, Max=317.399993896484375, range=96.399993896484375
Max absolute error = 0.0625000000
Max relative error = 0.000648
Max pw relative error = 0.000276
PSNR = 68.568258, NRMSE= 0.00037289547905824369
normError = 70.036950, normErr_norm = 0.000128
acEff=0.999998
compressionRatio=4.546184
decompression time = 0.104690 seconds.
decompressed data file: test.dat.sz.out
[sdi@localhost Orioltinto]$ sz -f -z -i test.dat -M ABS -A 0.0625 -3 52 25 2920
compression time = 0.136057
compressed data file: test.dat.sz
[sdi@localhost Orioltinto]$ sz -f -x -i test.dat -M ABS -A 0.0625 -3 52 25 2920 -s test.dat.sz -a
Min=221, Max=317.399993896484375, range=96.399993896484375
Max absolute error = 0.0625000000
Max relative error = 0.000648
Max pw relative error = 0.000277
PSNR = 68.568176, NRMSE= 0.00037289899094057274171
normError = 70.037610, normErr_norm = 0.000128
acEff=0.999998
compressionRatio=4.709267
decompression time = 0.107256 seconds.
decompressed data file: test.dat.sz.out

This means the compression code itself has no issue.

I'll check it in more detail later.

Best, Sheng


disheng222 commented 1 year ago

Moreover, according to my test, -3 52 25 2920 gives a slightly higher compression ratio than -3 2920 25 52. So, probably, 2920 is supposed to be the highest (slowest-changing) dimension in the dataset.
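In other words, for the C-ordered numpy array the fastest-varying size goes first on the sz command line; a small sketch (file name illustrative):

import numpy as np

# Time (2920) is the slowest-changing axis of the C-ordered array, so it
# is listed last in sz's dimension arguments.
data = np.zeros((2920, 25, 53), dtype=np.float32)
data.tofile("test.dat")
# sz -f -z -i test.dat -M ABS -A 0.0625 -3 53 25 2920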

Best, Sheng


disheng222 commented 1 year ago

The dimension of that test dataset is 2920x25x53 actually. So, the binary compression test result should be:

[sdi@localhost Orioltinto]$ sz -f -z -i test.dat -M ABS -A 0.0625 -3 53 25 2920
compression time = 0.138519
compressed data file: test.dat.sz
[sdi@localhost Orioltinto]$ sz -f -x -i test.dat -M ABS -A 0.0625 -3 53 25 2920 -s test.dat.sz -a
Min=221, Max=317.399993896484375, range=96.399993896484375
Max absolute error = 0.0625000000
Max relative error = 0.000648
Max pw relative error = 0.000274
PSNR = 68.565050, NRMSE= 0.0003730332270574076973
normError = 70.733295, normErr_norm = 0.000128
acEff=0.999998
compressionRatio=6.790179
decompression time = 0.093689 seconds.
decompressed data file: test.dat.sz.out

disheng222 commented 1 year ago

Here are some updates on my new tests. I converted the test.dat file to the HDF5 format (test.dat.h5) and then used the h5repack command with the absolute error bound 0.0625 to test the H5Z filter. The result shows that the error is bounded well. I ran 'print_h5repack_args -M ABS -A 0.0625' to generate the HDF5 filter parameters for SZ. Details are shown below. This test confirms that the H5Z filter has no bugs and compression should be fine. The only remaining suspect is the Python wrapper or similar.

===================
[sdi@localhost test]$ print_h5repack_args -M ABS -A 0.0625
-f UD=32017,0,9,0,1068498944,0,0,0,0,0,0,0
[sdi@localhost Orioltinto]$ ls
test.dat  test.dat.h5  test.dat.h5.dat  test.dat.sz  test.dat.sz.out  test.py
[sdi@localhost Orioltinto]$ h5repack -f UD=32017,0,9,0,1068498944,0,0,0,0,0,0,0 test.dat.h5 test.data.sz.h5
[sdi@localhost Orioltinto]$ ls -al
total 66036
drwxrwxr-x   2 sdi sdi     4096 Jun 11 14:52 .
drwxrwxr-x. 10 sdi sdi     4096 Jun 11 08:37 ..
-rw-rw-r--   1 sdi sdi 15476000 Jun 11 08:52 test.dat
-rw-rw-r--   1 sdi sdi  2283758 Jun 11 14:52 test.data.sz.h5
-rw-rw-r--   1 sdi sdi 15478048 Jun 11 14:46 test.dat.h5
-rw-rw-r--   1 sdi sdi 15476000 Jun 11 14:47 test.dat.h5.dat
-rw-rw-r--   1 sdi sdi  3406365 Jun 11 14:48 test.dat.sz
-rw-rw-r--   1 sdi sdi 15476000 Jun 11 14:48 test.dat.sz.out
-rw-rw-r--   1 sdi sdi     1439 Jun 11 09:01 test.py
[sdi@localhost Orioltinto]$ h5repack -f NONE test.data.sz.h5 test.dat.sz.out.h5
[sdi@localhost Orioltinto]$ ls -al
total 81156
drwxrwxr-x   2 sdi sdi     4096 Jun 11 14:53 .
drwxrwxr-x. 10 sdi sdi     4096 Jun 11 08:37 ..
-rw-rw-r--   1 sdi sdi 15476000 Jun 11 08:52 test.dat
-rw-rw-r--   1 sdi sdi  2283758 Jun 11 14:52 test.data.sz.h5
-rw-rw-r--   1 sdi sdi 15478048 Jun 11 14:46 test.dat.h5
-rw-rw-r--   1 sdi sdi 15476000 Jun 11 14:47 test.dat.h5.dat
-rw-rw-r--   1 sdi sdi  3406365 Jun 11 14:48 test.dat.sz
-rw-rw-r--   1 sdi sdi 15476000 Jun 11 14:48 test.dat.sz.out
-rw-rw-r--   1 sdi sdi 15480536 Jun 11 14:53 test.dat.sz.out.h5
-rw-rw-r--   1 sdi sdi     1439 Jun 11 09:01 test.py
[sdi@localhost Orioltinto]$ h5repack -f NONE test.data.sz.h5 test.dat.sz.out.h5
[sdi@localhost Orioltinto]$ h5dump -d /temperature -b LE -o test.dat.sz.out.h5.dat test.dat.sz.out.h5
HDF5 "test.dat.sz.out.h5" {
DATASET "/temperature" {
   DATATYPE  H5T_IEEE_F32LE
   DATASPACE  SIMPLE { ( 2920, 25, 53 ) / ( 2920, 25, 53 ) }
   DATA {
   }
}
}
[sdi@localhost Orioltinto]$ ls
test.dat         test.dat.h5      test.dat.sz      test.dat.sz.out.h5      test.py
test.data.sz.h5  test.dat.h5.dat  test.dat.sz.out  test.dat.sz.out.h5.dat
[sdi@localhost Orioltinto]$ compareData -f test.dat test.dat.sz.out.h5.dat
(the compareData command can be found in the github qcat package)
This is little-endian system.
reading data from test.dat
Min = 221, Max = 317.399993896484375, range = 96.399993896484375
Max absolute error = 0.0625000000
Max relative error = 0.000648
Max pw relative error = 0.000274
PSNR = 68.565050, NRMSE = 0.0003730332270574076973
normErr = 70.733295, normErr_norm = 0.000128
pearson coeff = 0.999998

disheng222 commented 1 year ago

@orioltinto Another possible reason is that you are using the embedded SZ filter in the hdf5 package. I remember that one year ago an HDF5 developer said the SZ filter would be included in the default filter list in new releases. Which HDF5 version did you use? Did you install the H5Z filter yourself, or did you use the embedded SZ filter in the HDF5 package?

orioltinto commented 1 year ago

Hi @disheng222 @robertu94, thanks for looking into the issue. Sorry, maybe it was not clear: this error happens with the version of SZ embedded in the hdf5plugin package. That version is not updated automatically; the last update on their side was 7 months ago, although it looks like it came after the latest SZ release.

You mentioned that the only remaining suspect is the Python wrapper; however, with gdb the dimensions that the filter receives looked correct to me.

disheng222 commented 1 year ago

I tested the latest version of the H5Z filter (HDF5 plugin mode), which looks good to me, as shown in my last message (please see the results with h5repack). Could you tell me which HDF5 version you are using and how you installed the H5Z filter? Thanks

Best Sheng


orioltinto commented 1 year ago

Hi @disheng222, I installed it through the Python package installer:

pip install hdf5plugin

In my first message I included the Dockerfile with all the requirements I used to reproduce the error. I tried it with the python:3.11 image, but the same thing happens with older versions of Python.

In this case the versions are:

- h5py: 3.8.0
- hdf5: 1.12.2

So the filter is actually reversing the order of the dimensions, but that shouldn't lead to the error threshold not being respected, is that right? If the order of the dimensions is not the issue and the other arguments look alike, might it be that the data is somehow already corrupted before arriving at the filter?

orioltinto commented 1 year ago

Hi again, I made a simpler example in which the constraint is not respected.

Here I import hdf5plugin only to register the filters and make them available, but I didn't use the interface for this example:

import struct
import tempfile

import h5py
import hdf5plugin  # noqa
import numpy as np

# Some parameters
SHAPE = (1000, 25, 25)
TOLERANCE = 0.01

def pack_float64(error: float) -> tuple:
    packed = struct.pack('>d', error)  # Pack as big-endian IEEE 754 double
    high = struct.unpack('>I', packed[0:4])[0]  # Unpack most-significant bits as unsigned int
    low = struct.unpack('>I', packed[4:8])[0]  # Unpack least-significant bits as unsigned int
    return high, low

# Generate random data
np.random.seed(0)
data = np.random.random(size=SHAPE).astype(np.float32)

high, low = pack_float64(TOLERANCE)

encoding = {
    'compression': 32017,
    'compression_opts': (0, high, low, 0, 0, 0, 0, 0, 0),
    'chunks': SHAPE,
}

with tempfile.NamedTemporaryFile() as tmp_file:
    # Create compressed file
    with h5py.File(tmp_file.name, 'w') as f:
        f.create_dataset('var', data=data, **encoding)

    # Open compressed file
    with h5py.File(tmp_file.name, 'r') as f:
        recovered_data = f["var"][:]

# Check that the data fulfills the constraint
if not np.allclose(data, recovered_data, atol=TOLERANCE):
    max_diff = np.max(np.abs(recovered_data - data))
    raise AssertionError(f"Condition not fulfilled for {TOLERANCE=} -> {max_diff=}")

The example can be reproduced in a Python container with only h5py and hdf5plugin installed. My Dockerfile looks like:

FROM python:3.11
RUN pip install h5py hdf5plugin
COPY code.py .
CMD python3 code.py

By the way, the error doesn't happen with float64 data.
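For comparison, the float64 control differs only in the dtype (np.random.random already returns float64); a minimal sketch under the same setup:

# float64 control: simply skip the cast to float32.
data = np.random.random(size=SHAPE)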

Some additional information:

Output of h5dump -H -p with the SZ filter:

HDF5 "/tmp/user/24207/tmp0hgxbikh" {
GROUP "/" {
   DATASET "var" {
      DATATYPE  H5T_IEEE_F32LE
      DATASPACE  SIMPLE { ( 1000, 25, 25 ) / ( 1000, 25, 25 ) }
      STORAGE_LAYOUT {
         CHUNKED ( 1000, 25, 25 )
         SIZE 585230 (4.272:1 COMPRESSION)
      }
      FILTERS {
         USER_DEFINED_FILTER {
            FILTER_ID 32017
            COMMENT SZ compressor/decompressor for floating-point data.
            PARAMS { 3 0 25 25 1000 0 1065646817 1202590843 0 0 0 0 0 0 }
         }
      }
      FILLVALUE {
         FILL_TIME H5D_FILL_TIME_ALLOC
         VALUE  H5D_FILL_VALUE_DEFAULT
      }
      ALLOCATION_TIME {
         H5D_ALLOC_TIME_INCR
      }
   }
}
}
Output of h5dump -H -p with the SZ3 filter:

HDF5 "/tmp/user/24207/tmpaua5_9w4" {
GROUP "/" {
   DATASET "var" {
      DATATYPE  H5T_IEEE_F32LE
      DATASPACE  SIMPLE { ( 1000, 25, 25 ) / ( 1000, 25, 25 ) }
      STORAGE_LAYOUT {
         CHUNKED ( 1000, 25, 25 )
         SIZE 490933 (5.092:1 COMPRESSION)
      }
      FILTERS {
         USER_DEFINED_FILTER {
            FILTER_ID 32024
            COMMENT SZ3 compressor/decompressor for floating-point data.
            PARAMS { 3 0 25 25 1000 0 1065646817 1202590843 0 0 0 0 0 0 }
         }
      }
      FILLVALUE {
         FILL_TIME H5D_FILL_TIME_ALLOC
         VALUE  H5D_FILL_VALUE_DEFAULT
      }
      ALLOCATION_TIME {
         H5D_ALLOC_TIME_INCR
      }
   }
}
}
disheng222 commented 1 year ago

Thanks. I’ll check it later.


orioltinto commented 1 year ago

I could verify that the error does not occur if I build the SZ filter myself, but it does occur with the version shipped with hdf5plugin. I guess the problem must be related to how the filter is built there.

The script (code.py) that I used to test this is the one I posted before, and a Dockerfile to reproduce it:

FROM ubuntu:rolling
RUN export DEBIAN_FRONTEND=noninteractive \
    && apt update \
    && apt install -yq vim make cmake wget git python3 python3-pip python3-venv \
    && apt install -yq swig gcc gfortran pkg-config libzstd-dev \
    && apt install -yq libhdf5-dev hdf5-tools
RUN git clone https://github.com/szcompressor/SZ.git --depth 1
RUN cd SZ; mkdir build ; cd build; cmake .. -DBUILD_HDF5_FILTER:BOOL=ON ; make ; make install
RUN pip install h5py  --no-binary h5py --break-system-packages
RUN pip install hdf5plugin --break-system-packages
COPY code.py .
CMD HDF5_PLUGIN_PATH=/SZ/build/hdf5-filter/H5Z-SZ/ python3 code.py ; \
    echo "_______________________" ; \
    python3 code.py

I will open an issue at the hdf5plugin github page.

t20100 commented 1 year ago

This indeed looks to come from compilation flags used in hdf5plugin. PR https://github.com/silx-kit/hdf5plugin/issues/268 should fix this.

orioltinto commented 1 year ago

It looks like the flag '-ffast-math' was the cause of the problem. Thanks @t20100 for fixing it.

Thanks for your feedback as well @disheng222 @robertu94 .

robertu94 commented 1 year ago

Thank you @t20100 and @orioltinto. I wouldn't have thought of -ffast-math because I don't use it very often at all. Glad this issue is resolved.

epasveer commented 1 year ago

I found this task an interesting one to follow.

I found this tidbit on Stack Overflow about -ffast-math. Not sure if it explains how it comes into play here, but it's an interesting read too.

https://stackoverflow.com/questions/7420665/what-does-gccs-ffast-math-actually-do

disheng222 commented 1 year ago

Thanks all for finding this issue and resolving it!
