ACCarnall / bagpipes

Bagpipes is a state-of-the-art code for generating realistic model galaxy spectra and fitting these to spectroscopic and photometric observations. Users should install with pip, not by cloning the repository.
http://bagpipes.readthedocs.io
GNU General Public License v3.0

data fitting issue #2 #40

Open elNava opened 1 year ago

elNava commented 1 year ago

Hi,

I am having the same error detailed in issue #9. In that case the filter curve was found to be the problem, but I am using NIRCam filters taken directly from SVO. I am following the steps in the notebooks, and this is my input:

[[0.0, 9.9e+100],
 [33.539945030185606, 1.3178399224148853],
 [77.90387252133127, 1.41562451327364],
 [151.94228944830488, 1.3753196635696123],
 [330.4698020919672, 1.5488078001448062],
 [578.144563222296, 1.6778350311376689],
 [751.0610971956924, 2.94762245193689],
 [892.349345751542, 2.2701580967575326]]
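
In case it helps, here is a minimal sketch of how an array like this gets passed to bagpipes, following the load_data convention from the documentation. The filter file paths below are placeholders rather than my actual ones:

import numpy as np
import bagpipes as pipes

def load_data(ID):
    # Fluxes and 1-sigma errors, one row per filter (microjanskys by
    # default), in the same order as filt_list below.
    return np.array([[0.0, 9.9e+100],
                     [33.539945030185606, 1.3178399224148853],
                     [77.90387252133127, 1.41562451327364],
                     [151.94228944830488, 1.3753196635696123],
                     [330.4698020919672, 1.5488078001448062],
                     [578.144563222296, 1.6778350311376689],
                     [751.0610971956924, 2.94762245193689],
                     [892.349345751542, 2.2701580967575326]])

# Placeholder paths: one two-column (wavelength in Angstroms,
# transmission) file per band, downloaded from SVO.
filt_list = ["filters/JWST_NIRCam." + band + ".dat"
             for band in ["F090W", "F115W", "F150W", "F200W",
                          "F277W", "F356W", "F410M", "F444W"]]

galaxy = pipes.galaxy("1420", load_data, filt_list=filt_list,
                      spectrum_exists=False)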

And the error:

Bagpipes: fitting object 1420

Completed in 0.1 seconds.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-eaec54d81345> in <module>
      1 fit = pipes.fit(galaxy, fit_instructions)
----> 2 fit.fit()

~/.local/lib/python3.8/site-packages/bagpipes/fitting/fit.py in fit(self, verbose, n_live, use_MPI)
    134 
    135             # Load MultiNest outputs and save basic quantities to file.
--> 136             samples2d = np.loadtxt(self.fname + "post_equal_weights.dat")
    137             lnz_line = open(self.fname + "stats.dat").readline().split()
    138 

/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/npyio.py in loadtxt(fname, dtype, comments, delimiter, converters, skiprows, usecols, unpack, ndmin, encoding, max_rows)
   1137         # converting the data
   1138         X = None
-> 1139         for x in read_data(_loadtxt_chunksize):
   1140             if X is None:
   1141                 X = np.array(x, dtype)

/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/npyio.py in read_data(chunk_size)
   1065 
   1066             # Convert each value according to its column and store
-> 1067             items = [conv(val) for (conv, val) in zip(converters, vals)]
   1068 
   1069             # Then pack it according to the dtype's nesting

/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/npyio.py in <listcomp>(.0)
   1065 
   1066             # Convert each value according to its column and store
-> 1067             items = [conv(val) for (conv, val) in zip(converters, vals)]
   1068 
   1069             # Then pack it according to the dtype's nesting

/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/npyio.py in floatconv(x)
    761         if '0x' in x:
    762             return float.fromhex(x)
--> 763         return float(x)
    764 
    765     typ = dtype.type

ValueError: could not convert string to float: '-0.999000000000000022+100'

This is pretty similar to the previously mentioned issue. I have already checked that the SVO filters are in Angstroms; I do not know what else to check.
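
As a side note on the string itself: it looks like Fortran-formatted output in which an exponent needing three digits drops its 'E' marker, so '-0.999000000000000022+100' stands for roughly -1e+100. A small sketch (my own guess, not anything from the bagpipes docs) that recovers such values:

import re

def parse_fortran_float(s):
    # Fortran drops the 'E' when an exponent has three digits, e.g.
    # '-0.999000000000000022+100' means -0.999...e+100.
    try:
        return float(s)
    except ValueError:
        return float(re.sub(r"(\d)([+-]\d{3})$", r"\1e\2", s))

print(parse_fortran_float("-0.999000000000000022+100"))  # -9.99e+99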

Thanks in advance. Cheers, Benjamin

markliuchina commented 1 year ago

I ran into the same issue. I also noticed issue #9 and checked my filter files one by one; after that, nothing was fixed. The problem may also come from the fitting component parameter settings, specifically the parameter range values. At the time I was trying to reproduce one of Adam's papers (Carnall et al. 2020, published in MNRAS, staa1535). I set up the double-power-law SFH component strictly following the parameters in the paper, without thinking too much about it (my fault), and the Jupyter notebook failed in the same way as yours. I then compared my code line by line against the example that describes fitting a "double-power-law SFH component" and found where the problem was.

[screenshot from Carnall et al. 2020]

In the screenshot, the stellar mass range is (1, 10^13), sampled in log space. As a result, you might set the input parameter this way (since log10(1) == 0 and log10(10^13) == 13):

dblplaw["massformed"] = (0., 13.)

When you run the fit, the code fails. From testing, the lower limit of a parameter with a log prior should not be 0; a value close to 0 but not equal to 0 works, so I set:

dblplaw["massformed"] = (0.00001, 13.)         #因为无法把这个质量下限的log_10取0,所以近似取0.00001,防止kernel崩溃。
dblplaw["metallicity"] = (0.1, 2.5)            #对数空间下的取值应该是-1到0.4之间,对应就是0.1到2.5
dblplaw["massformed_prior"] = "log_10"
dblplaw["metallicity_prior"] = "log_10"

And then the problem was fixed! (Though I am not sure whether this biases the fitting result.) I suggest you check the parameter ranges before fitting. Here is the component setup code I used before:

dblplaw = {}

dblplaw["tau"] = (0.1, 15.)
dblplaw["tau_prior"] = "uniform"
dblplaw["massformed"] = (1., 13.)
dblplaw["massformed_prior"] = "log_10"
dblplaw["metallicity"] = (0.1, 2.5)
dblplaw["metallicity_prior"] = "log_10"

dblplaw["alpha"] = (0.01, 1000.)
dblplaw["beta"] = (0.01, 1000.)
dblplaw["alpha_prior"] = "log_10"
dblplaw["beta_prior"] = "log_10"

dust = {}                           
dust["type"] = "Calzetti"
dust["delta"] = (-0.3, 0.3)
dust["delta_prior"] = "Gaussian"
dust["delta_prior_mu"] = 0.
dust["delta_prior_sigma"] = 0.1

dust["B"] = (0., 5.)
dust["B_prior"] = "uniform"

dust["Av"] = (0., 8.)
dust["Av_prior"] = "uniform"
dust["qpah"] = 2.
dust["umin"] = 1.
dust["gamma"] = 0.01

nebular = {}
nebular["logU"] = -3.

fit_info = {}
fit_info["redshift"] = (0., 10.)
fit_info["redshift_prior"] = "uniform"
fit_info["dblplaw"] = dblplaw 
fit_info["dust"] = dust
fit_info["nebular"] = nebular

You could just paste it into your notebook and see whether the problem is the same as I describe above. Hope it helps!
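
For completeness, the call that runs the fit, the same as in the example notebooks:

# Assuming `galaxy` was already built with pipes.galaxy(...).
fit = pipes.fit(galaxy, fit_info)
fit.fit(verbose=True)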

Mingfeng Liu

ACCarnall commented 1 year ago

This is a pretty generic error, unfortunately; it could be a problem with your data, your filter curves, or the model you're fitting. If you've checked your filters, then maybe paste your model here and I'll take a look.

elNava commented 1 year ago

Thank you very much @markliuchina for all your feedback on this. I was actually doing the same thing, trying to replicate the fitting model in Carnall et al. 2020. I tried changing the limit but it did not work; I also tried running the notebook with your model, and the error persists.

Thanks @ACCarnall for your response too. Regarding the filter curves, I am using NIRCam F090W, F115W, F150W, F200W, F277W, F356W, F410M, and F444W, taken from the SVO webpage; you can find them here. Here is the model I am using:

# from https://arxiv.org/abs/2001.11975
dblplaw = {}                             # double-power-law
dblplaw["massformed"] = (1., 13.)        # vary log_10(M*/M_solar) between 1 and 13
dblplaw["massformed_prior"] = "log10"

dblplaw["metallicity"] = (0.1, 2.5)      # vary Z between 0.1 and 2.5 Z_oldsolar
dblplaw["metallicity_prior"] = "log10"

dblplaw["alpha"] = (0.01, 1000)          # Falling slope index
dblplaw["beta"] = (0.01, 1000)           # Rising slope index
dblplaw["alpha_prior"] = "log_10"
dblplaw["beta_prior"] = "log_10"

dblplaw["tau"] = (0.1, 15.)              # Age of Universe at turnover: Gyr
dblplaw["tau_prior"] = "uniform"

dust = {}
dust["type"] = "Salim"                   # Attenuation law: "Calzetti", "Cardelli", "CF00" or "Salim"
dust["Av"] = (0., 8.)                    # Absolute attenuation in the V band: magnitudes
dust["Av_prior"] = "uniform"
dust["delta"] = (-0.3, 0.3)              # Deviation from Calzetti slope ("Salim" type only)
dust["delta_prior"] = "Gaussian"
dust["B"] = (0., 5)                      # 2175A bump strength ("Salim" type only)
dust["B_prior"] = "uniform"

fit_instructions = {}
fit_instructions["redshift"] = (0., 10.)  # Observed redshift
fit_instructions["redshift_prior"] = "uniform"
fit_instructions["sfh_comp"] = dblplaw    # Dict containing SFH info
fit_instructions["dust"] = dust

If none of the above is responsible for the issue, the only thing left is the data. However, I have checked it, paying attention to things like the units, and have already tried changing those with no success.

ACCarnall commented 1 year ago

The mass formed already has a log prior by default, so you can delete the line

dblplaw["massformed_prior"] = "log10"

Hopefully this should fix things.

elNava commented 1 year ago

Hi @ACCarnall,

I tried deleting that line and it did not work :(

ACCarnall commented 1 year ago

Ok, nothing else looks wrong with your model, so the issue must be elsewhere. Have you tried deleting the pipes directory that stores outputs and running again? I'm not sure what further advice I can give here. For some reason the likelihood function is returning nan. I'd recommend going into bagpipes/fitting/fitted_model.py and inserting print statements into the likelihood function until you figure out where this is coming from, then work backwards to the original problem.
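
Something like the following pattern, as a self-contained toy rather than the real code (the actual likelihood method in fitted_model.py is more involved):

import numpy as np

# Toy chi-squared likelihood instrumented to report where non-finite
# values first appear; the same checks can be dropped into the real
# likelihood in bagpipes/fitting/fitted_model.py.
def lnlike(model_phot, obs_phot, obs_err):
    if not np.all(np.isfinite(model_phot)):
        print("non-finite model photometry:", model_phot)
    diff = (obs_phot - model_phot) / obs_err
    if not np.all(np.isfinite(diff)):
        print("non-finite residuals (check for zero errors):", diff)
    return -0.5 * np.sum(diff**2)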

markliuchina commented 1 year ago

@elNava Hi, Benjamin. I think the problem is still the model parameter values. There is another possibility. Your error is

ValueError: could not convert string to float: '-0.999000000000000022+100'

which may indicate a value-format problem. If your filter files and photometry data are all prepared properly, please check the input values in your model, because I noticed some of your parameters are set like this:

dblplaw["alpha"] = (0.01, 1000)          # Falling slope index
dblplaw["beta"] = (0.01, 1000)           # Rising slope index

Could you please try adding a "." after the "1000":

dblplaw["alpha"] = (0.01, 1000.)          # Falling slope index
dblplaw["beta"] = (0.01, 1000.)           # Rising slope index

In this way, the "int" value "1000" can be converted to "float" value. You can add "." after all your integer values in your model in order to make them "float". As you can see in Adam's example: image Hope it fix things.

Best.

Mingfeng Liu

elNava commented 1 year ago

Hi,

Sorry for the late reply. Thanks @markliuchina Mingfeng for your enormous help. I tried deleting the pipes directory as @ACCarnall recommended above, and it finally ran. I had more format problems after that, which I should have been more careful about: I was asking for a Gaussian prior without specifying "mu" and "sigma", and the "log10" prior does not exist; "log_10" must be used instead. Without correcting these, the code enters an endless loop, printing keyword errors.
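
Concretely, the corrections look like this, using the parameter names from the model I posted above:

dblplaw["massformed_prior"] = "log_10"   # "log10" is not a recognised prior
dblplaw["metallicity_prior"] = "log_10"

dust["delta_prior"] = "Gaussian"
dust["delta_prior_mu"] = 0.              # a Gaussian prior needs both
dust["delta_prior_sigma"] = 0.1          # mu and sigma to be specified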

When I corrected the above, it ran properly. However, there is another problem, coming from the hdf5io.py script that handles data storage. It goes as follows:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 3
      1 # This does not work if the directory is already created.
      2 fit = pipes.fit(galaxy, fit_instructions)
----> 3 fit.fit()

File ~/.local/lib/python3.8/site-packages/bagpipes/fitting/fit.py:150, in fit.fit(self, verbose, n_live, use_MPI)
    148 with warnings.catch_warnings():
    149     warnings.simplefilter("ignore")
--> 150     dd.io.save(self.fname[:-1] + ".h5", self.results)
    152 os.system("rm " + self.fname + "*")
    154 self._print_results()

File ~/.local/lib/python3.8/site-packages/deepdish/io/hdf5io.py:584, in save(path, data, compression)
    582     idtable[id(data)] = '/'
    583     for key, value in data.items():
--> 584         _save_level(h5file, group, value, name=key,
    585                     filters=filters, idtable=idtable)
    587 elif (_sns and isinstance(data, SimpleNamespace) and
    588       _dict_native_ok(data.__dict__)):
    589     idtable[id(data)] = '/'

File ~/.local/lib/python3.8/site-packages/deepdish/io/hdf5io.py:250, in _save_level(handler, group, level, name, filters, idtable)
    246         _save_level(handler, new_group, entry, name=level_name,
    247                     filters=filters, idtable=idtable)
    249 elif isinstance(level, np.ndarray):
--> 250     _save_ndarray(handler, group, name, level, filters=filters)
    252 elif _pandas and isinstance(level, (pd.DataFrame, pd.Series)):
    253     store = _HDFStoreWithHandle(handler)

File ~/.local/lib/python3.8/site-packages/deepdish/io/hdf5io.py:125, in _save_ndarray(handler, group, name, x, filters)
    123     itemsize = x.itemsize
    124     atom = tables.StringAtom(itemsize)
--> 125 elif x.dtype == np.object:
    126     # Not supported by HDF5, force pickling
    127     _save_pickled(handler, group, x, name=name)
    128     return

File ~/.local/lib/python3.8/site-packages/numpy/__init__.py:305, in __getattr__(attr)
    300     warnings.warn(
    301         f"In the future `np.{attr}` will be defined as the "
    302         "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
    304 if attr in __former_attrs__:
--> 305     raise AttributeError(__former_attrs__[attr])
    307 # Importing Tester requires importing all of UnitTest which is not a
    308 # cheap import Since it is mainly used in test suits, we lazy import it
    309 # here to save on the order of 10 ms of import time for most users
    310 #
    311 # The previous way Tester was imported also had a side effect of adding
    312 # the full `numpy.testing` namespace
    313 if attr == 'testing':

AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. 
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

I think the np.object usage is deprecated, and this is probably a problem that can be solved with an appropriate numpy version, but I wanted to confirm with you before changing my package environment. Although the error above shows up, the fit object is created; it just does not have a posterior attribute, and the bagpipes plotting methods fail.
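
One workaround I have seen suggested (an assumption on my part, not an official fix) is to pin numpy below 1.24, or to restore the removed alias before deepdish is imported:

import numpy as np

# np.object was removed in numpy 1.24; restoring the alias lets the old
# deepdish code run. Alternatively: pip install "numpy<1.24".
if not hasattr(np, "object"):
    np.object = object

import deepdish as dd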

Again, thank you both for all your help; I will keep an eye on this issue. Cheers, Benjamin.

ACCarnall commented 1 year ago

Hi Benjamin,

There are some compatibility issues between recent versions of numpy and deepdish. Maybe try matching these versions and see if that fixes things? I'm running numpy==1.22.3 and deepdish==0.3.6, which seems to work ok. You might also want to check out recent issues on the deepdish GitHub repository for more info.

Cheers, Adam