kiyo-masui / bitshuffle

Filter for improving compression of typed binary data.

Usage Example #32

Closed simongregorebner closed 8 years ago

simongregorebner commented 8 years ago

Added a usage example to the readme, as I found it difficult to figure out how to use the library correctly from Python.

kiyo-masui commented 8 years ago

Great, this is really useful. A few comments:

Can you change the heading from "Example" to "h5py Example", or something of that ilk? While most people seem to use the h5py interface, in principle it is only one part of the library.

Can you add the line print(h5py.__version__) # '2.X.Y' (filling in X and Y)? I'm not sure when h5py started supporting arbitrary filters in the high-level interface, but it is recent, and I have been using the low-level interface (see all the annoying code I had to write in bitshuffle/h5.pyx, which I guess is obsolete now).

I would stylize the create_dataset line as follows, as the long line is difficult to read:

dataset = filehandle.create_dataset(
    "data",
    (100, 100, 100),
    maxshape=(None, 100, 100),
    compression=32008,
    compression_opts=(block_size, bitshuffle.h5.H5_COMPRESS_LZ4),
    chunks=(1, 100, 100),
    dtype='float32',
    )

If giving a minimal example, why specify maxshape?

Why only fill the first chunk with data? You might as well use array = numpy.random.rand(100, 100, 100) and dataset[:] = array.

The h5py documentation generally uses the variable name f for File objects. It is less verbose, but note that an h5py.File is not really a file handle.

Note that the 'official' recommendation is to set block_size = 0 and let Bitshuffle choose its value (which in your example comes out to be 2048 anyway). I would follow that, and explain in a comment what the special value 0 means.

Please change 32008 to bitshuffle.h5.H5FILTER.
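
Putting the suggestions above together, the example might end up looking roughly like the sketch below. It is only an illustration, not the final readme text: the function name write_and_read_example and the read-back step are assumptions added here so the result can be checked, and it relies on importing bitshuffle.h5 being enough to register the filter with HDF5, which is my understanding of the module but worth verifying against your installed versions.

```python
def write_and_read_example(path):
    """Write a bitshuffle + LZ4 compressed dataset with h5py and read it back.

    Hypothetical helper for illustration; not part of bitshuffle or h5py.
    """
    # Third-party imports are kept local so the function can be defined
    # even on a machine where h5py or bitshuffle is not installed.
    import numpy
    import h5py
    import bitshuffle.h5  # assumed to register the filter on import

    # Record the h5py version the example was run with.
    print(h5py.__version__)

    # 0 is a special value: let Bitshuffle choose the block size itself.
    block_size = 0

    # Fill the whole dataset, not just the first chunk.
    array = numpy.random.rand(100, 100, 100).astype("float32")

    with h5py.File(path, "w") as f:
        dataset = f.create_dataset(
            "data",
            (100, 100, 100),
            compression=bitshuffle.h5.H5FILTER,
            compression_opts=(block_size, bitshuffle.h5.H5_COMPRESS_LZ4),
            chunks=(1, 100, 100),
            dtype="float32",
        )
        dataset[:] = array

    # Read the data back to confirm the filter round-trips losslessly.
    with h5py.File(path, "r") as f:
        read_back = f["data"][:]

    return array, read_back
```

Calling write_and_read_example("example.h5") should return two arrays that compare equal with numpy.array_equal, since the filter is lossless. maxshape is left out here because a minimal example does not need a resizable dataset.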

If you have no objections, either you or I could make the proposed changes.

simongregorebner commented 8 years ago

I updated the parts of the code you mentioned. Feel free to make further changes as you like.