Open vnmanoharan opened 4 years ago
This method appears to make the sampler run out of bounds, and when I can get it to run, the fit is bad:
I thought maybe it should be sep_observed and cr_observed instead of sep_angular and contrast_ratios, but that returns a 'bad initial energy' error.
I think I understand the math form of the likelihood function now (the Stan Manual and your notes were very helpful). The remaining parts I don't understand (and where maybe something is going wrong) are:
pm.Potential() - I don't understand exactly what this does, and I'm having trouble finding a mathematical explanation; maybe it's being used incorrectly here
I'm not sure that using normal_lcdf is equivalent to taking the log of the cdf. Since it's normalized, perhaps it does need to be multiplied by a parameter, N, which describes the number of samples that fall outside the range? Vinny, you said this is mathematically equivalent to the March paper. They use the parameter N for truncated data, which suggests to me that it may be important to include in our model.
(I've tried scaling each normal_lcdf function by its own parameter N but cannot get the model to run.)
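As a numerical aside (using scipy's normal as a stand-in, since I can check it directly): a dedicated log-CDF routine agrees with taking the log of the CDF away from the tails, but is better behaved deep in the tail, which is presumably why PyMC3 provides normal_lcdf at all:

```python
import numpy as np
from scipy.stats import norm

# Away from the tails, logcdf(x) == log(cdf(x)) to machine precision
x = np.linspace(-3, 3, 7)
assert np.allclose(norm.logcdf(x), np.log(norm.cdf(x)))

# Deep in the lower tail, cdf underflows to 0.0 so its log is -inf,
# while the dedicated logcdf routine stays finite
assert np.isneginf(np.log(norm.cdf(-40.0)))
assert np.isfinite(norm.logcdf(-40.0))
```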
Good question. The factor of N should be included implicitly because sep_angular is a vector. pm.Potential is not well documented, but it adds a term to the log posterior.
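To make that concrete, here is a sketch of the math pm.Potential performs, written in plain numpy/scipy rather than PyMC3 (the data and parameter values are made up, not from the notebook): whatever expression you pass to pm.Potential is simply added to the model's log posterior. For an upper-truncated normal likelihood, that added term is -N * log Phi(upper), one normalization factor per observation, which is where the factor of N enters implicitly when the correction is summed over a data vector:

```python
import numpy as np
from scipy.stats import norm, truncnorm

# Illustrative values only (not from the actual model)
data = np.array([0.8, 1.1, 0.4])
mu, sigma, upper = 1.0, 0.5, 1.5

# Untruncated log likelihood plus a "potential" term: for truncation
# above at `upper`, the term is -log Phi(upper) per observation, so
# the factor of N = len(data) appears automatically.
untruncated = norm.logpdf(data, mu, sigma).sum()
potential = -len(data) * norm.logcdf(upper, mu, sigma)
truncated_loglike = untruncated + potential

# Cross-check against scipy's truncated normal computed directly
a, b = -np.inf, (upper - mu) / sigma
direct = truncnorm.logpdf(data, a, b, loc=mu, scale=sigma).sum()
assert np.isclose(truncated_loglike, direct)
```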
To see a simplified notebook looking at this issue, see troubleshooting_model.ipynb in the model_testing branch
I think the correct way to truncate the likelihood would be to do it in one step, using
This truncates the likelihood on both sides. The logdiffexp function is a more numerically stable way of working with the lcdf. I translated this formulation from the Stan manual, "Truncation with lower and upper bounds in Stan". That section of the manual also explains why the likelihood has this form. If you change from cr_inverse as your observations to cr (see #39), you'll have to replace the sep_min function above with one that operates on the contrast ratios.
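To make the logdiffexp trick concrete, here is a sketch in plain numpy/scipy (the bounds and data are made up, and the actual model variables like sep_min are not reproduced): the two-sided normalization log(Phi(upper) - Phi(lower)) is built from two lcdf values via logdiffexp, which avoids exponentiating and subtracting small probabilities directly, and the result matches scipy's truncated normal:

```python
import numpy as np
from scipy.stats import norm, truncnorm

def logdiffexp(a, b):
    # log(exp(a) - exp(b)), computed stably, assuming a > b
    return a + np.log1p(-np.exp(b - a))

# Illustrative values only (not from the actual model)
data = np.array([0.8, 1.1, 0.4])
mu, sigma = 1.0, 0.5
lower, upper = 0.2, 1.5

# Two-sided truncation: subtract log(Phi(upper) - Phi(lower)) per
# observation, using only lcdf values via logdiffexp
log_norm = logdiffexp(norm.logcdf(upper, mu, sigma),
                      norm.logcdf(lower, mu, sigma))
trunc_loglike = norm.logpdf(data, mu, sigma).sum() - len(data) * log_norm

# Cross-check against scipy's truncated normal computed directly
a, b = (lower - mu) / sigma, (upper - mu) / sigma
direct = truncnorm.logpdf(data, a, b, loc=mu, scale=sigma).sum()
assert np.isclose(trunc_loglike, direct)
```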