bd-j / prospector

Python code for Stellar Population Inference from Spectra and SEDs
http://prospect.readthedocs.io
MIT License

MCMC sampling finding very far-off MAP values #331

Open elmiio opened 3 months ago

elmiio commented 3 months ago

Hello,

I have recently run into an issue while modelling with prospector. Essentially, the emcee sampling yields maximum a posteriori (MAP) estimates that are very obviously incorrect.

[image: corner plot]

In this corner plot, you can see that walker chains have settled on a very small "local maximum" in the mass parameter, despite it very clearly not being the right solution when compared to the rest of the probability distribution. As a result, I obtain a MAP spectrum that is also very clearly far off from the data:

[image: MAP spectrum compared with the data]

I have since been able to avoid this issue by adjusting the prior on my mass parameter, but I am still wondering how the program was able to produce such a result in the first place. It would be helpful to know so that I (or anyone else) do not run into a similar issue in the future.

Thank you!
Omar

bd-j commented 2 months ago

Hi,

I'm not sure how you are getting the MAP parameters from the prospector outputs. It is not at all obvious to me that the walker chains "settled" on the blue cross in the corner plot, if that is what you are suggesting is the MAP solution (which I agree is clearly a terrible solution).

Anyway, it should not be possible for that solution to have a higher posterior probability than something that's 7 orders of magnitude brighter, like the bulk of the samples, so I wonder if there is some issue in how the MAP parameters are being obtained from the posterior chain. More details on how the MAP parameters you're plotting were obtained would be helpful. Thanks!
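To make that argument concrete, here is a toy sketch (not prospector code) of how strongly a Gaussian likelihood penalizes a model that is 7 orders of magnitude too faint. The ten bands, the S/N of 10, and the helper `gaussian_lnlike` are all hypothetical values chosen for illustration, and the sketch assumes the model flux scales linearly with the mass parameter.

```python
import numpy as np

def gaussian_lnlike(model, data, sigma):
    """Gaussian log-likelihood, up to an additive constant."""
    return -0.5 * np.sum(((data - model) / sigma) ** 2)

# Hypothetical photometry: 10 bands, each with S/N = 10 (illustrative only).
data = np.full(10, 1.0)
sigma = 0.1 * data

# A model that matches the data, standing in for the bulk of the samples.
lnl_bulk = gaussian_lnlike(1.0 * data, data, sigma)

# The same model scaled down by 7 orders of magnitude in flux.
lnl_faint = gaussian_lnlike(1e-7 * data, data, sigma)

print("ln L (bulk): ", lnl_bulk)
print("ln L (faint):", lnl_faint)
print("difference:  ", lnl_bulk - lnl_faint)
```

In this toy setup the faint model sits roughly 500 units lower in ln-likelihood, so a prior would have to favor it by a comparable amount in ln-probability for it to end up as the MAP sample.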

elmiio commented 1 month ago

Hello,

The MAP parameters that I am plotting (mass, metallicity and age) were found through minimization and MCMC sampling. The initial values used were [1.0e7, -0.5, 5] for the mass, metallicity and age, respectively. The enabled run parameters for the minimization and emcee sampling were the following:

MINIMIZATION:
run_params["dynesty"] = False
run_params["emcee"] = False
run_params["optimize"] = True
run_params["min_method"] = 'lm'
run_params["nmin"] = 5

EMCEE:
run_params["optimize"] = False
run_params["emcee"] = True
run_params["dynesty"] = False
run_params["nwalkers"] = 128
run_params["niter"] = 1024
run_params["nburn"] = [100, 200, 400]

Thank you!

bd-j commented 1 month ago

Hi @elmiio, thanks for the information. I would still like to know how exactly you are extracting the best fit parameters from the prospector output. If you could post a code snippet that would be great. Thanks.

elmiio commented 1 month ago

[four images: screenshots of the code]

This is the code used to extract the best fit parameters from the prospector output. I hope this helps.

bd-j commented 1 month ago

I think these snippets are only missing the last part where you extract a set of parameters from result and plot it in the corner plot as the blue point.

elmiio commented 1 month ago

I've updated my last comment to include what I believe is a snippet of the code you are referring to. Please let me know if there is any additional information needed.

bd-j commented 1 month ago

Thanks, can you also print the prospector version, theta_max, the maximum of result["lnprobability"], and the value of result["lnprobability"][i, j]?
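For reference, a minimal sketch of the kind of check being asked for. It assumes an emcee-style output where `result["chain"]` has shape (nwalkers, niter, ndim) and `result["lnprobability"]` has shape (nwalkers, niter). The helper `map_from_chain` is a hypothetical name, and the arrays here are synthetic stand-ins so the snippet runs without a results file.

```python
import numpy as np

def map_from_chain(chain, lnprobability):
    """Return (theta_max, lnp_max, (i, j)) for the highest-probability sample.

    Assumes chain has shape (nwalkers, niter, ndim) and lnprobability
    has shape (nwalkers, niter).
    """
    imax = np.argmax(lnprobability)                      # index into the flattened array
    i, j = np.unravel_index(imax, lnprobability.shape)   # walker index, iteration index
    return chain[i, j, :].copy(), lnprobability[i, j], (i, j)

# Synthetic stand-ins for result["chain"] and result["lnprobability"].
rng = np.random.default_rng(0)
nwalkers, niter, ndim = 4, 6, 3
chain = rng.normal(size=(nwalkers, niter, ndim))
lnprobability = -0.5 * np.sum(chain ** 2, axis=-1)

theta_max, lnp_max, (i, j) = map_from_chain(chain, lnprobability)
print("theta_max:", theta_max)
print("max lnprobability:", lnprobability.max())
print("lnprobability[i, j]:", lnprobability[i, j], "at (i, j) =", (i, j))
```

If the last two printed values differ in a real run, the indices used to pull `theta_max` out of the chain do not point at the highest-probability sample. The installed version can usually be printed with `prospect.__version__`, though that attribute is assumed here rather than checked.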