ACCarnall / bagpipes

Bagpipes is a state of the art code for generating realistic model galaxy spectra and fitting these to spectroscopic and photometric observations. Users should install with pip, not by cloning the repository.
http://bagpipes.readthedocs.io
GNU General Public License v3.0

mpirun fit_catalog stalls on galaxy that converges fine individually #12

Open thriveth opened 3 years ago

thriveth commented 3 years ago

First of all, thanks for writing this code and making it available, and thanks for a very good documentation that flattens the learning curve a great deal!

I am trying to fit a catalog using fit_catalog in a script that I run through mpirun, as suggested in the example notebooks. However, the function stalled after a few objects, for an entire night with four cores running at full capacity before I killed the process.

Trying to fit the same galaxies one-by-one, also using mpirun, goes fine, they typically converge within a couple of minutes. I first tried fitting the offending galaxy individually, then running the fit_catalog function again, but it stalled on the next one, too, which also fits fine when fit individually.

The only feedback I get is some warnings that various parameters are converging at the edge of the prior. I get the same warnings, but much fewer and no stalling, when fitting individually.

Any idea what is going on?

ACCarnall commented 3 years ago

Hi Thøger,

Glad you're getting on well with the code!

I can't immediately think of a reason running fits within fit_catalogue would cause a problem that you don't get when fitting objects individually. Is it possible you're accidentally doing something different with redshifts when fitting the catalogue, i.e. leaving the redshifts free instead of fixing them?
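One quick way to rule this out is to inspect the fit_instructions dictionary right before each call. The sketch below assumes the usual bagpipes convention that a (lower, upper) tuple marks a free parameter and a single number marks a fixed one. `redshift_is_free` is a hypothetical helper written for illustration and is not part of bagpipes.

```python
def redshift_is_free(fit_instructions):
    """Return True if the "redshift" entry looks like a (lower, upper) prior range.

    Hypothetical helper, not part of bagpipes. Assumes a two-element
    tuple/list means a free parameter and a bare number means fixed.
    """
    z = fit_instructions.get("redshift")
    return isinstance(z, (tuple, list)) and len(z) == 2 and z[0] != z[1]


# Illustrative dictionaries only; the values are made up.
free_z = {"redshift": (0.0, 10.0)}
fixed_z = {"redshift": 1.5}

print(redshift_is_free(free_z))   # True
print(redshift_is_free(fixed_z))  # False
```

Printing this in both the single-object script and the catalogue script should show whether the two runs treat the redshift the same way. It does not account for any redshifts passed separately to fit_catalogue.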

There is of course a degree of randomness in the times taken for MultiNest fits to complete. It's not unusual to see variations of up to, say, an order of magnitude when fitting different objects with the same data, or even just re-running the same fit on the same object several times. However, it sounds like what you're experiencing is outside the normal variation you'd expect. Likewise, warnings about converging to the edge of the prior aren't necessarily a problem; they just mean the edge of the prior is included in the sampling at that stage in the process, and it can easily become excluded later.

Have you tried wiping the pipes directory and re-running the fit_catalogue instance to see whether you reliably hit the same snag on the same object? Or re-running the individual fit on the problem object to check it reliably finishes in a sensible time? I'd be happy to look at some code if you'd like to make it available to me.

thriveth commented 3 years ago

Hi Adam,

Thanks for your reply.

I am using the same script and, more importantly, the same fit_instructions dictionary whether I am fitting a catalog or individual objects; the only difference is whether I call one function or the other. So I think there should be minimal risk of things being different. Specifically, redshifts are free in all cases (phot-z is one of the main things I am interested in getting).

It seems that several objects gave problems in fit_catalog even though they all worked fine individually, with at least a couple of successful runs for at least one of them (testing out various options).

I wonder if it has something to do with the data being in "ergscma" - the fit_catalog function does not seem to take the phot_units keyword that the simple pipes.fit function does. Does fit_catalogue only support data in Microjansky?

ACCarnall commented 3 years ago

Hmm, yeah, that sounds very plausible. Setting phot_units should really be an option in fit_catalogue; this is the first time I've noticed that you can't. If that's what's going on, you should be able to tell by plotting the outputs and seeing whether the data look the same when fitted by fit and by fit_catalogue.
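If the units do turn out to be the culprit, one possible workaround until phot_units is exposed is to convert the photometry to microjanskys inside your own load_data function. Below is a minimal sketch of that conversion, assuming fluxes in erg/s/cm^2/Å and a representative (e.g. effective) wavelength in Å for each filter. The function names are made up for illustration and are not part of bagpipes.

```python
# Speed of light in Angstrom per second.
C_ANGSTROM_PER_S = 2.99792458e18


def flam_to_mujy(flam, wav_angstrom):
    """Convert f_lambda (erg/s/cm^2/A) to f_nu in microjanskys.

    Uses f_nu = f_lambda * lambda^2 / c, with 1 microjansky equal to
    1e-29 erg/s/cm^2/Hz. Hypothetical helper, not a bagpipes function.
    """
    return flam * wav_angstrom**2 / C_ANGSTROM_PER_S * 1e29


def mujy_to_flam(fnu_mujy, wav_angstrom):
    """Inverse conversion: microjanskys to erg/s/cm^2/A."""
    return fnu_mujy * 1e-29 * C_ANGSTROM_PER_S / wav_angstrom**2


# Example with made-up numbers: 1e-17 erg/s/cm^2/A at 10000 A.
print(flam_to_mujy(1e-17, 10000.0))
```

The same factor applies to the flux errors, and because the functions only use arithmetic they should also work element-wise on NumPy arrays of fluxes and wavelengths.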