tomasstolker / species

Toolkit for atmospheric characterization of directly imaged exoplanets
https://species.readthedocs.io
MIT License
22 stars 10 forks source link

[bug] Writing to hdf5 file fails with run_multinest and run_ultranest. #63

Closed nenasedk closed 1 year ago

nenasedk commented 1 year ago

species == 0.5.5 h5py==3.8.0 python~=3.10.10 and similar

When running a grid fit as in the tutorial, the fit.run_ultranest and fit.run_multinest functions fail to output correctly to the hdf5 file, returning the error TypeError: Object dtype dtype('O') has no native HDF5 equivalent. This error persists across the use of multiple model grids and companion choices.

The printed outputs are reasonable, providing median and max likelihood parameters, and some samples are clearly being written as the database.get_mcmc_spectra and species.plot_posterior work fine. However, I think this initial output issue might cause problems with the species.plot_spectrum function, which fails to fully complete the plot, and instead gives the error ValueError: zero-size array to reduction operation minimum which has no identity.

tomasstolker commented 1 year ago

Thanks for opening this issue, @nenasedk!

Add which line in (presumably) fit_model.py does the error occur?

nenasedk commented 1 year ago

I'd need to rerun the retrieval to get the full traceback, but as far as I can tell it's occuring here:


   2097 if mpi_rank == 0:
   2098     # Writing the samples to the database is only
   2099     # possible when using a single process
   2100     species_db = database.Database()
-> 2102     species_db.add_samples(
   2103         sampler="ultranest",
   2104         samples=samples,
   2105         ln_prob=ln_prob,```
tomasstolker commented 1 year ago

Thanks! I just tested the function and it runs fine for me so might be something specific with your OS or dependencies.

Please post the full traceback if you get the chance then I can try figure out what the issue is.

nenasedk commented 1 year ago

Here's the full traceback. This one is for a multinest retrieval rather than an ultranest. This is in a clean conda environment, installing the dependancies from the requirements.txt file. I'm using MacOS 10.14.6.


TypeError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 fit.run_multinest(tag=f'{planet_name}_{current_model}',
      2                   n_live_points=200,
      3                   output='multinest[/](https://file+.vscode-resource.vscode-cdn.net/)',
      4                   prior={'mass': (7.8, 1.2)})

File [~/anaconda3/envs/species/lib/python3.10/site-packages/typeguard/__init__.py:1033](https://file+.vscode-resource.vscode-cdn.net/Users/nasedkin/Documents/Paper2_HR8799_Notebooks/~/anaconda3/envs/species/lib/python3.10/site-packages/typeguard/__init__.py:1033), in typechecked..wrapper(*args, **kwargs)
   1031 memo = _CallMemo(python_func, _localns, args=args, kwargs=kwargs)
   1032 check_argument_types(memo)
-> 1033 retval = func(*args, **kwargs)
   1034 try:
   1035     check_return_type(retval, memo)

File [~/python-packages/species/species/analysis/fit_model.py:1875](https://file+.vscode-resource.vscode-cdn.net/Users/nasedkin/Documents/Paper2_HR8799_Notebooks/~/python-packages/species/species/analysis/fit_model.py:1875), in FitModel.run_multinest(self, tag, n_live_points, output, prior)
   1871 if mpi_rank == 0:
   1872     # Writing the samples to the database is only possible when using a single process
   1873     species_db = database.Database()
-> 1875     species_db.add_samples(
   1876         sampler="multinest",
   1877         samples=samples,
   1878         ln_prob=ln_prob,
   1879         tag=tag,
   1880         modelpar=self.modelpar,
   1881         spec_labels=spec_labels,
   1882         attr_dict=attr_dict,
   1883     )

File [~/anaconda3/envs/species/lib/python3.10/site-packages/typeguard/__init__.py:1033](https://file+.vscode-resource.vscode-cdn.net/Users/nasedkin/Documents/Paper2_HR8799_Notebooks/~/anaconda3/envs/species/lib/python3.10/site-packages/typeguard/__init__.py:1033), in typechecked..wrapper(*args, **kwargs)
   1031 memo = _CallMemo(python_func, _localns, args=args, kwargs=kwargs)
   1032 check_argument_types(memo)
-> 1033 retval = func(*args, **kwargs)
   1034 try:
   1035     check_return_type(retval, memo)

File [~/python-packages/species/species/data/database.py:1870](https://file+.vscode-resource.vscode-cdn.net/Users/nasedkin/Documents/Paper2_HR8799_Notebooks/~/python-packages/species/species/data/database.py:1870), in Database.add_samples(self, sampler, samples, ln_prob, tag, modelpar, ln_evidence, mean_accept, spectrum, parallax, spec_labels, attr_dict)
   1867     dset.attrs[f"autocorrelation{i}"] = float(auto_corr)
   1869 for key, value in attr_dict.items():
-> 1870     dset.attrs[key] = value
   1872 h5_file.close()

File h5py[/_objects.pyx:54](https://file+.vscode-resource.vscode-cdn.net/_objects.pyx:54), in h5py._objects.with_phil.wrapper()

File h5py[/_objects.pyx:55](https://file+.vscode-resource.vscode-cdn.net/_objects.pyx:55), in h5py._objects.with_phil.wrapper()

File [~/anaconda3/envs/species/lib/python3.10/site-packages/h5py/_hl/attrs.py:104](https://file+.vscode-resource.vscode-cdn.net/Users/nasedkin/Documents/Paper2_HR8799_Notebooks/~/anaconda3/envs/species/lib/python3.10/site-packages/h5py/_hl/attrs.py:104), in AttributeManager.__setitem__(self, name, value)
     96 @with_phil
     97 def __setitem__(self, name, value):
     98     """ Set a new attribute, overwriting any existing attribute.
     99 
    100     The type and shape of the attribute are determined from the data.  To
    101     use a specific type or shape, or to preserve the type of an attribute,
    102     use the methods create() and modify().
    103     """
--> 104     self.create(name, data=value)

File [~/anaconda3/envs/species/lib/python3.10/site-packages/h5py/_hl/attrs.py:181](https://file+.vscode-resource.vscode-cdn.net/Users/nasedkin/Documents/Paper2_HR8799_Notebooks/~/anaconda3/envs/species/lib/python3.10/site-packages/h5py/_hl/attrs.py:181), in AttributeManager.create(self, name, data, shape, dtype)
    179 # Make HDF5 datatype and dataspace for the H5A calls
    180 if use_htype is None:
--> 181     htype = h5t.py_create(original_dtype, logical=True)
    182     htype2 = h5t.py_create(original_dtype)  # Must be bit-for-bit representation rather than logical
    183 else:

File h5py[/h5t.pyx:1664](https://file+.vscode-resource.vscode-cdn.net/h5t.pyx:1664), in h5py.h5t.py_create()

File h5py[/h5t.pyx:1688](https://file+.vscode-resource.vscode-cdn.net/h5t.pyx:1688), in h5py.h5t.py_create()

File h5py[/h5t.pyx:1748](https://file+.vscode-resource.vscode-cdn.net/h5t.pyx:1748), in h5py.h5t.py_create()

TypeError: Object dtype dtype('O') has no native HDF5 equivalent```
tomasstolker commented 1 year ago

Would you mind running once more and adding this line: print(key, value, type(key), type(value)) within the for loop at line 1869 of database.py (here I think: ~/python-packages/species/species/data/database.py)

nenasedk commented 1 year ago

The output from printing that is


spec_name petitcode-hot-cloudy <class 'str'> <class 'str'>
ln_evidence (1283.7930494417099, 0.11618213849420066) <class 'str'> <class 'tuple'>
parallax 24.462 <class 'str'> <class 'numpy.float64'>
ext_filter None <class 'str'> <class 'NoneType'>
tomasstolker commented 1 year ago

Ah you are using the latest version from Github! Excellent!

The bug is related to the new ext_filter parameter. An attribute in the HDF5 database can not be set to None.

I think it should be fixed now with commit d54160ebfb90ca800fc9ebb982894dcc11f8a101.

If you pip install from Github again then it should hopefully work!

nenasedk commented 1 year ago

Great, it works now, thanks a lot! Turns out the downstream bug in plot_spectrum has to do with the choice of filters I was using, so it's unrelated to this issue.