BEAST-Fitting / beast

Bayesian Extinction And Stellar Tool
http://beast.readthedocs.io

HDF5 support via h5py instead of pytables #9

Open karllark opened 7 years ago

karllark commented 7 years ago

h5py support is easier to maintain and does not require the external hdf5 system library. Having to get this system library installed is one of the barriers to easy use of the BEAST. In addition, h5py may be more pythonic.

mfouesneau commented 7 years ago

requires HDF5 1.8.4 or newer, shared library version with development headers (libhdf5-dev or similar)

That is the same requirement as pytables. But astropy requires h5py.

drvdputt commented 6 years ago

I have been using h5py for a bit, and found that it could not read the filters string from the grid attributes. Probably an instance of this issue https://github.com/h5py/h5py/issues/624

Example:

import h5py

f = h5py.File('beast_example_phat_seds.grid.hd5', 'r')  # open read-only
g = f['grid']
g.attrs['filters']  # this is the line that raises the OSError

results in

OSError                                   Traceback (most recent call last)
<ipython-input-10-0952a280e528> in <module>()
----> 1 g.attrs['filters']

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

~/Software/miniconda3/lib/python3.6/site-packages/h5py/_hl/attrs.py in __getitem__(self, name)
     79 
     80         arr = numpy.ndarray(shape, dtype=dtype, order='C')
---> 81         attr.read(arr, mtype=htype)
     82 
     83         if len(arr.shape) == 0:

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5a.pyx in h5py.h5a.AttrID.read()

h5py/_proxy.pyx in h5py._proxy.attr_rw()

OSError: Unable to read attribute (no appropriate function for conversion path)

Pytables does not have this problem. Of course, this example file was written by pytables, so this might be an incompatibility problem.

karllark commented 6 years ago

Yep. If/when we move away from pytables, we will need to make sure all our existing hd5 files can be read by h5py. I still feel making the switch would be beneficial, since we would have less code to maintain ourselves, but it is nontrivial.
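
One way such a compatibility check could be sketched: walk an existing file with h5py and try to read every attribute, collecting the ones that fail (like the `filters` attribute above). The function name `unreadable_attrs` is hypothetical and not part of the BEAST; this is a rough sketch under those assumptions, not a tested migration tool.

```python
import h5py


def unreadable_attrs(path):
    """Return (object path, attribute name, error message) for each
    attribute that h5py fails to read in the file at `path`."""
    failures = []

    def check(name, obj):
        # Try to read every attribute attached to this group or dataset.
        for key in obj.attrs.keys():
            try:
                obj.attrs[key]
            except (OSError, TypeError, ValueError) as exc:
                failures.append((name, key, str(exc)))

    with h5py.File(path, "r") as f:
        check("/", f)        # attributes on the root group
        f.visititems(check)  # every group and dataset below the root
    return failures
```

Running this over the existing hd5 files would give a list of attributes that need converting (or rewriting) before a switch; an empty list means h5py could read every attribute in that file.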

lea-hagen commented 4 years ago

I've come across another possible reason to move away from pytables: it doesn't seem to handle closing files very well. I haven't investigated very far, but as we do production runs with large files, it could be causing extra memory usage.

In [1]: import tables

In [2]: for i in range(5):
   ...:     x = tables.open_file('14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5')
   ...:

In [3]: exit
Closing remaining open files:
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
14675_LMC-5665ne-12232_beast_noisemodel_bin1.grid.hd5...done
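
For comparison, a minimal sketch of the same repeated-open loop with h5py, using a context manager so each handle is closed before the next one is opened. The helper name `read_top_level_names` is hypothetical, and this has not been checked against the production files mentioned above.

```python
import h5py


def read_top_level_names(path, repeats=5):
    """Open `path` read-only `repeats` times and return its top-level names.
    Each handle is closed when its `with` block exits."""
    names = []
    for _ in range(repeats):
        with h5py.File(path, "r") as f:
            names = list(f.keys())
        # The file is closed here; h5py file objects are falsy once closed.
        assert not f
    return names
```

Because every handle is closed inside the loop, no open files should be left over at interpreter exit.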

karllark commented 4 years ago

Yep. See #64. I've investigated and not managed to figure out how to close this file manually. I like the idea that this is another reason to move away from pytables. :-)