Closed thaonguyen-lanl closed 4 years ago
Unfortunately, there is no guarantee that automatic step size tuning will be better than the default step sizes. The algorithm we have implemented uses short chains over multiple step sizes, assesses acceptance rate, then interpolates using logistic regression to try to pick a step size that will give an "optimal" acceptance rate. In my experience, it does not solve all sampling problems and sometimes using defaults, or even hand-tuning, can give what appear to be better results. I think it's very problem dependent.
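For anyone curious what "short chains plus logistic regression" looks like in practice, here is a minimal, self-contained sketch of that style of tuning. It is not Sepia's actual implementation (all function names here are made up for illustration): it runs a short Metropolis chain at each candidate step size against a toy standard-normal target, records accept/reject flags, fits a simple logistic regression of acceptance on log step size, and then inverts the fit to find the step size predicted to hit a target acceptance rate.

```python
import numpy as np

def metropolis_accepts(step, n=200, seed=None):
    # Short Metropolis chain targeting a standard normal;
    # returns an array of 0/1 accept flags, one per proposal.
    rng = np.random.default_rng(seed)
    x = 0.0
    flags = []
    for _ in range(n):
        prop = x + step * rng.standard_normal()
        if np.log(rng.random()) < 0.5 * (x**2 - prop**2):
            x = prop
            flags.append(1)
        else:
            flags.append(0)
    return np.array(flags)

def fit_logistic(logx, y, iters=5000, lr=0.1):
    # 1-D logistic regression p(accept) = sigmoid(a + b*log(step)),
    # fit by plain gradient ascent on the log-likelihood.
    a, b = 0.0, 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(a + b * logx)))
        a += lr * np.mean(y - p)
        b += lr * np.mean((y - p) * logx)
    return a, b

def tune_step_size(candidates, target=0.44, n=200, seed=0):
    # Pool accept/reject data across candidate step sizes, fit the
    # logistic model, then solve for the step size whose predicted
    # acceptance equals the target rate.
    logx, y = [], []
    for i, s in enumerate(candidates):
        flags = metropolis_accepts(s, n=n, seed=seed + i)
        logx.extend([np.log(s)] * len(flags))
        y.extend(flags)
    a, b = fit_logistic(np.array(logx), np.array(y))
    logit_target = np.log(target / (1.0 - target))
    return float(np.exp((logit_target - a) / b))

tuned = tune_step_size([0.5, 1.0, 2.0, 4.0, 8.0])
```

For a 1-D standard normal the tuned value typically lands somewhere near 2, consistent with classic random-walk Metropolis scaling heuristics, but as noted above this kind of tuning is very problem dependent.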
What arguments did you pass to tune_step_sizes()? Have you tried the same problem with GPMSA and used auto step size tuning? We weren't able to verify that GPMSA and Sepia would give exactly the same step size tuning results due to differences in the logistic regression algorithms, but in my tests they gave very similar results.
I used "model.tune_step_sizes(50, 20)". I used GPMSA for step size tuning in 2019, when I still had the student license for the MATLAB tools required to run GPMSA. Since 2020 I have not had access, so I could not check and compare on the same case. But in my experience I did not see the behavior of hanging at the boundary (0 or 1) that I observed with Sepia. The problems I have been working on are very similar between 2019 and now.
OK, thank you. We are looking into it now with a simulated example where we saw similar behavior. I will post here with any updates. For now, try using the default step sizes.
So the step size selection itself was fine, but to be compatible with GPMSA, we now set the start value to the last value from the step size chain (instead of starting at the default initial start value). This seems to perform better in our simulated examples. Can you try the latest version of the code and see if it performs better for your problem?
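To illustrate the change described above with a toy example (this is a hypothetical sketch, not the actual SepiaModel.py code; all names here are invented): the tuning chains are run back to back, and the final state of the last tuning chain is returned as the start value for the production MCMC, instead of resetting to the default initial start.

```python
import numpy as np

def short_chain(start, step, n, rng):
    # Short Metropolis chain targeting a standard normal;
    # returns the final state of the chain.
    x = start
    for _ in range(n):
        prop = x + step * rng.standard_normal()
        if np.log(rng.random()) < 0.5 * (x**2 - prop**2):
            x = prop
    return x

def tune_then_start(default_start=0.0, candidate_steps=(0.5, 1.0, 2.0),
                    n=100, seed=0):
    rng = np.random.default_rng(seed)
    x = default_start
    for s in candidate_steps:
        # each tuning chain continues from where the previous one ended
        x = short_chain(x, s, n, rng)
    # old behavior: return default_start (reset before the main run)
    # new behavior: return x, the last value from the step size chain
    return x

start_value = tune_then_start()
```

Carrying the last tuning state forward means the production chain starts from a point already moved toward the posterior by the tuning runs, which is consistent with the faster convergence reported below.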
Yes, I can try. Could you please let me know which Python script was updated and needs to be downloaded before the run?
Thanks.
It is SepiaModel.py
Thanks! Yes, that behavior disappears! It gets to the desired distribution quicker than without using it.
OK. We may adjust how the function sets start values in the future, but for now it should behave similarly to GPMSA. I am closing the issue for now.
I tested with multiple variables, but the calibration results with step size optimization are worse than without it. The MCMC draws tend to stay at the boundary (0 or 1) rather than converging to the expected distribution.