This is motivated specifically by a Dirichlet-Multinomial case (see nimble-users post 2024-10-16) with nested Dirichlet distributions and conjugate sampling for the Dirichlet-distributed variable that is a dependency of the CRP cluster memberships. Much of the issue occurs when the Dirichlet dependency has an exact 0 because of numerical underflow. It is not clear whether the numerical issues would arise in other cases.
The PR:
- converts any `NaN` values in `curLogProb` to `-Inf`, so that sampling can proceed without ever selecting the cluster with the `NaN` logProb.
- errors out if all `curLogProb` values are `-Inf`, as it is not clear what legitimate action could be taken in that case.
- gives a warning that results may not be valid if multiple `curLogProb` values are `Inf`. In this case sampling is uniform over the clusters with those `Inf` logProbs. There is also a good argument for erroring out, as it's not clear that one would get a valid chain in this case (the `Inf` values would probably not be equal if we were working in higher precision and could resolve them to their actual finite values). But it seems most user-friendly to give the results and let the user decide.
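
The three rules above can be illustrated with a minimal sketch. This is not the NIMBLE sampler code: it is a standalone Python illustration, and the function name `sanitize_and_sample`, the error and warning messages, and the handling of a single `Inf` value (treated here as selecting that cluster with certainty) are assumptions made for the example.

```python
import math
import random
import warnings


def sanitize_and_sample(cur_log_prob, rng=random):
    """Sample a 0-based cluster index from unnormalized log-probabilities.

    Hypothetical helper mirroring the three rules described in this PR.
    """
    # Rule 1: NaN -> -Inf, so that cluster gets zero weight but sampling proceeds.
    log_prob = [-math.inf if math.isnan(lp) else lp for lp in cur_log_prob]

    # Rule 2: if every value is -Inf there is nothing legitimate to sample from.
    if all(lp == -math.inf for lp in log_prob):
        raise ValueError("all log-probabilities are -Inf; cannot sample a cluster")

    # Rule 3: with several +Inf values, warn and sample uniformly among them.
    inf_idx = [i for i, lp in enumerate(log_prob) if lp == math.inf]
    if inf_idx:
        if len(inf_idx) > 1:
            warnings.warn("multiple Inf log-probabilities; results may not be valid")
        # Assumption: a single Inf simply selects that cluster.
        return rng.choice(inf_idx)

    # Ordinary case: subtract the maximum before exponentiating for stability.
    max_lp = max(log_prob)
    weights = [math.exp(lp - max_lp) for lp in log_prob]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

For example, `sanitize_and_sample([float("nan"), 0.0])` can only return index 1, because the `NaN` entry is mapped to `-Inf` and therefore to a weight of exactly 0.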