Closed maho3 closed 1 year ago
Comparison! On the same Quijote TPCF data point, we have:
SBI SNPE_C with one MAF:
Pydelfi with one MAF:
Not a rigorous comparison yet, but it's cool that they both work!
And here are the rank statistics for the PyDELFI+MAF model. Quite biased at the edges of the prior, but at least it looks like it's learning something! To be improved...
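For reference, a minimal sketch of how rank statistics like these are typically computed (in the style of simulation-based calibration): for each test point, count how many posterior samples fall below the true parameter. The function name and the toy data here are hypothetical, not from the actual test pipeline.

```python
import numpy as np

def rank_statistic(posterior_samples, theta_true):
    """Rank of the true parameter among posterior samples, per dimension.

    Over many test points, a well-calibrated posterior yields uniformly
    distributed ranks; pile-ups at 0 or n_samples indicate the bias at
    the prior edges seen in the plot above.
    """
    # posterior_samples: (n_samples, n_dims), theta_true: (n_dims,)
    return np.sum(posterior_samples < theta_true, axis=0)

# Toy check: samples centered on the truth give mid-range ranks.
rng = np.random.default_rng(0)
truth = np.array([0.3, 0.8])
samples = truth + 0.05 * rng.standard_normal((1000, 2))
ranks = rank_statistic(samples, truth)
print(ranks)  # one rank per parameter dimension, each in [0, 1000]
```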
Okay so... PyDELFI does indeed work! The issues we were seeing in the previous comment were caused by the sampling chains not converging at inference time. The solution was to increase the burn-in to 1000 samples per chain, which gives unbiased posteriors! Below are some plots derived from our toy simulator example:
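To illustrate why the burn-in length matters, here is a toy sketch with a bare-bones random-walk Metropolis sampler (a stand-in for emcee, not PyDELFI's actual implementation): a chain started far from the posterior mode is badly biased until the early samples are discarded. All names and numbers here are illustrative.

```python
import numpy as np

def metropolis_chain(log_prob, x0, n_steps, step=0.5, seed=0):
    """Minimal random-walk Metropolis sampler (toy stand-in for emcee)."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_prob(x0)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.standard_normal()
        lp_prop = log_prob(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Target: standard normal. Start far from the mode so burn-in matters.
log_prob = lambda x: -0.5 * x**2
chain = metropolis_chain(log_prob, x0=10.0, n_steps=3000)
burn_in = 1000
post = chain[burn_in:]  # keep only post-burn-in samples
print(post.mean())  # should sit near the true mean of 0
```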
Here's a single constraint. Notice that there are a few chains that still hadn't converged when burn-in ended.
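A quick way to flag those unconverged chains quantitatively is the Gelman-Rubin diagnostic. This is a hedged sketch in plain numpy (not what PyDELFI runs); the threshold values are the conventional rules of thumb.

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for chains of shape (n_chains, n_samples).

    R-hat near 1 suggests the chains have mixed; values well above
    ~1.1 flag chains that had not converged when burn-in ended.
    """
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)  # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
mixed = rng.standard_normal((4, 500))               # converged chains
stuck = mixed + np.array([[0.], [0.], [0.], [5.]])  # one stray chain
print(gelman_rubin(mixed), gelman_rubin(stuck))
```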
And here's the ensemble of constraints averaged over the test set.
Now, the bad part. The emcee sampling currently implemented in PyDELFI is really slow. The above plots were created using only 10 samples (after burn-in) for each of 200 test points, and inference still took ~45 minutes. If we were to evaluate this on all of Quijote, it could take days...
It is currently built to use MPI, but it's hard to get that working. It takes advantage of neither batched evaluations on a GPU nor CPU multiprocessing. These are all changes that we could make in pydelfi_wrappers.py, but they are best reserved for a future PR. I have made issue #47 for this.
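The batching change amounts to evaluating the log-probability for all walkers in one array operation instead of a Python loop over walkers (emcee 3 exposes this via its `vectorize=True` option). A toy sketch with a Gaussian stand-in for the neural likelihood, just to show the two call shapes are equivalent:

```python
import numpy as np

def log_prob_single(theta):
    """Per-point log-probability, as a walker-by-walker loop calls it."""
    return -0.5 * np.sum(theta**2)

def log_prob_batched(thetas):
    """Same density evaluated for all walkers in one vectorized call;
    this is the shape of change proposed in issue #47 (batch the
    network's log-prob over walkers instead of looping in Python)."""
    return -0.5 * np.sum(thetas**2, axis=-1)

walkers = np.random.default_rng(2).standard_normal((64, 5))
looped = np.array([log_prob_single(w) for w in walkers])
batched = log_prob_batched(walkers)
print(np.allclose(looped, batched))  # identical results, one call
```

For a neural density estimator, the batched version replaces 64 separate network forward passes with a single one, which is where the GPU win comes from.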
One more note: I've realized that, if you leave out the sequential training procedure, the likelihood-estimation framework in PyDELFI is exactly that of SNLE_A in the sbi package. In fact, the PyDELFI paper points directly to the SNLE_A paper for its likelihood-estimation fitting.
As a result, it may almost always be better to just use the sbi package instead of PyDELFI (given its regular maintenance), but we should include both for completeness.
This probably needs some stress-testing. I hope to make a comparison plot of Quijote TPCF with sbi and pydelfi before making this a full PR.