galacticusorg / galacticus

The Galacticus galaxy formation model
GNU General Public License v3.0

MCMC tutorial #738

Open Andrew-Robertson opened 3 hours ago

Andrew-Robertson commented 3 hours ago

I was trying to follow Tutorial: Constraining Galacticus Parameters. The returned log-likelihoods were $\sim -10^{293}$, which I believe was because the likelihood was comparing the mean stellar mass for all halo mass bins, but Galacticus was only populating one of these bins with halos, due to:

  <mergerTreeBuildMasses           value="fixedMass"                  >
    <massTree          value="1.0e12"/>
    <treeCount         value="4"     />
  </mergerTreeBuildMasses>

I think that parameters/tutorials/mcmcBase.xml needs a couple of small changes for this to run successfully. The likelihoodBin parameter should be likelihoodBins, and the bin corresponding to a halo mass of $10^{12} ~ M_\odot$ should be bin 9, not bin 11.

https://github.com/galacticusorg/galacticus/blob/a1e98d06453787cab7b691c196cd520ea8ddddb0/parameters/tutorials/mcmcBase.xml#L142

With these changes Galacticus is now running, and producing what appear to be sensible log likelihood values, though it is yet to complete, so I don't know that everything works successfully.

abensonca commented 2 hours ago

Those changes look good. In the case of likelihoodBin<-->likelihoodBins I think this issue arose because originally this was likelihoodBin - i.e. it supported only a single bin (or all bins), but was later changed to allow multiple bins.

In any case, those changes look good if you want to go ahead and open a PR. I think the tutorial page needs changing also (since these parameter settings are also listed there).

Andrew-Robertson commented 2 hours ago

Okidoki. I can make those changes, though at the moment the results don't correspond to a converged output of samples drawn from the posterior (though maybe this is fine for a demo that shouldn't take forever to run). I ran 8 chains, and the distribution of the parameter values in the 8 output chain files is plotted below (using KDE to smooth them out).

[Image: KDE-smoothed distributions of the parameter values from the 8 output chain files]
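For anyone wanting to reproduce this kind of smoothed plot, here is a minimal sketch of a one-dimensional Gaussian KDE using only the Python standard library. It is not the code used for the plot above. The function name `gaussian_kde`, the normal-reference bandwidth rule, and the sample values are all assumptions for illustration, and reading the parameter column out of the chain files is left out because their layout is not described in this thread.

```python
import math
from statistics import stdev


def gaussian_kde(samples, bandwidth=None):
    """Return a function estimating the density of `samples` (hypothetical helper).

    If no bandwidth is given, use the normal-reference rule of thumb
    h = 1.06 * sigma * n**(-1/5), which is an assumption, not a Galacticus default.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        bandwidth = 1.06 * stdev(samples) * n ** (-1.0 / 5.0)
    if bandwidth <= 0.0:
        raise ValueError("bandwidth must be positive (are all samples identical?)")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centred on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Illustrative values only; real values would come from one chain's parameter column.
chain_samples = [0.0, 1.0, 2.0]
density = gaussian_kde(chain_samples)
curve = [(x / 10.0, density(x / 10.0)) for x in range(-30, 51)]
```

With only ten steps per chain, a curve like this mostly reflects the bandwidth choice rather than the posterior.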

I don't really know how to read the files related to convergence, but maybe these already say not to trust the results: Convergence files: mcmcChains_0000.convergence.log mcmcConvergence.log

And here is one of the 8 chain files: mcmcChains_0000.log

Also, here are the config files (as .txt, as GitHub doesn't seem to like uploading .xml files): mcmcConfig.xml.txt mcmcBase.xml.txt

abensonca commented 2 hours ago

Yes, definitely not converged. This tutorial model is only allowed to run for ten steps, and is then stopped. The convergence files could use documenting. Basically:

 outliers              10 T T T T T T T T

means all chains are considered to be outliers (mostly meaningless with this few samples), and:

 convergence           10   4.0282675885835779        4.0282675885835779        4.0282675885835779     

gives the min and max Gelman-Rubin $\hat{R}$ statistic across all parameters, followed by $\hat{R}$ for each parameter individually (and there's only one parameter here, so the values are all the same).
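As a rough illustration of the description above, the sketch below splits a `convergence` line into its fields and computes the textbook Gelman-Rubin $\hat{R}$ for a single parameter from several equal-length chains. The function names are hypothetical, reading the second field (`10`) as the step count is an assumption, and the exact $\hat{R}$ variant Galacticus implements is not stated in this thread, so the formula here may differ in detail.

```python
import math
from statistics import mean, variance


def parse_convergence_line(line):
    """Split a 'convergence' log line into min, max, and per-parameter R-hat."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "convergence":
        raise ValueError("not a convergence line")
    values = [float(v) for v in fields[2:]]
    return {
        "step": int(fields[1]),  # assumed to be the step count
        "r_hat_min": values[0],
        "r_hat_max": values[1],
        "r_hat_per_parameter": values[2:],
    }


def gelman_rubin(chains):
    """Textbook Gelman-Rubin R-hat for one parameter.

    `chains` is a list of m >= 2 chains, each a list of n >= 2 samples.
    """
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two equal-length chains of two or more samples")
    within = mean(variance(c) for c in chains)       # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])  # B: between-chain variance
    pooled = (n - 1) / n * within + between / n      # pooled variance estimate
    return math.sqrt(pooled / within)


# The line quoted above, with its single parameter.
parsed = parse_convergence_line(
    " convergence           10   4.0282675885835779        "
    "4.0282675885835779        4.0282675885835779     "
)

# Two made-up chains that sit far apart give an R-hat well above 1.
r_hat = gelman_rubin([[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]])
```

Values of $\hat{R}$ close to 1 indicate that the chains agree with each other; a value near 4, as in the line above, means the spread between chains is much larger than the spread within them.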