sbi-benchmark / sbibm

Simulation-based inference benchmark
https://sbi-benchmark.github.io
MIT License

fixed the gaussian mixture example #54

Closed: h3jia closed this 1 year ago

h3jia commented 1 year ago

Not sure if this is pyro/torch-version dependent, but I found that if I pass a batch of thetas at once, the samples either all come from the smaller Gaussian or all come from the larger Gaussian.

I think the issue is that the simulator makes a single call to pdist.Categorical and draws one component index, regardless of the batch size of the input theta. With a simple fix that draws one index per theta, the samples now look correct on my side.
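A minimal sketch of the bug and the fix, with hypothetical function and parameter names (the actual sbibm task differs in its weights and scales; this only illustrates the per-theta vs. shared Categorical draw):

```python
import torch
import torch.distributions as pdist

def sample_mixture(theta, mixture_weights=(0.5, 0.5), scales=(1.0, 0.1)):
    """Sample from a two-component Gaussian mixture centred on each theta.

    Illustrative sketch, not sbibm's actual code: the fix draws one
    component index PER theta instead of a single index shared by the
    whole batch.
    """
    n = theta.shape[0]
    # Buggy version (for illustration): one shared index for all thetas,
    # so the whole batch comes from the same mixture component:
    #   idx = pdist.Categorical(torch.tensor(mixture_weights)).sample()  # scalar!
    # Fixed: one independent component index per row of theta
    idx = pdist.Categorical(torch.tensor(mixture_weights)).sample((n,))
    chosen_scale = torch.tensor(scales)[idx].unsqueeze(-1)  # shape (n, 1)
    return theta + chosen_scale * torch.randn_like(theta)
```

With the shared scalar index, every sample in the batch uses the same scale; with the per-theta draw, roughly half the rows use each component, as intended.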

janfb commented 1 year ago

Thanks for reporting this and suggesting a fix @h3jia

I can confirm that the current implementation selects only a single index independently of the number of passed parameters.

@jan-matthis I suggest labelling this as a bug. Can you request me as a reviewer? Then I can review the fix suggested by @h3jia.

In a follow-up PR I could add documentation and a corresponding test.

jan-matthis commented 1 year ago

@h3jia thank you very much for reporting this! Merged #63 in favour of this PR, which adds a test, and released the fix as part of v1.1.0.

Since experiments in the paper were run with a simulation batch size of 1000, this has an effect on results. We will issue an update to account for this.

h3jia commented 1 year ago

@jan-matthis Glad to hear that this has been merged. Can I ask some related questions: what's the optimizer setup for the benchmark results (or where can I find the info if it's somewhere)? Did you tune e.g. the learning rates for each individual problem and method, or do you just use some universal value?

And you mentioned that the batch size was 1000; is this only for the 100k budget cases? I guess you'd use a smaller batch size if you only have a budget of 1k-10k.

jan-matthis commented 1 year ago

> @jan-matthis Glad to hear that this has been merged. Can I ask some related questions: what's the optimizer setup for the benchmark results (or where can I find the info if it's somewhere)? Did you tune e.g. the learning rates for each individual problem and method, or do you just use some universal value?

Sure! All configs are in https://github.com/sbi-benchmark/results/tree/main/benchmarking_sbi/config. We did not tune hyperparameters per task but rather across tasks. Our rationale is in the paper, section 4, last paragraph.

> And you mentioned that the batch size was 1000; is this only for the 100k budget cases? I guess you'd use a smaller batch size if you only have a budget of 1k-10k.

simulation_batch_size was generally 1000, but note that it can get clipped. For example, when running a neural sequential algorithm, it is clipped to at most num_simulations_per_round.
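The clipping rule described above can be sketched as follows (hypothetical helper name, not sbibm's actual code):

```python
def effective_batch_size(simulation_batch_size: int,
                         num_simulations_per_round: int) -> int:
    """Illustrative sketch of the clipping rule: a sequential algorithm
    never simulates more per call than one round allows, so the batch
    size is capped at num_simulations_per_round."""
    return min(simulation_batch_size, num_simulations_per_round)
```

For example, with a 1k budget split over 10 rounds (100 simulations per round), the default batch size of 1000 would effectively be reduced to 100.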