rho62 commented 2 years ago

Approach pursued so far

Sample from the original (untruncated) distribution, then truncate. This is inefficient: it draws a surplus of elements that are then discarded, and it is difficult to predict how many draws are needed to reach the target sample size.
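The cost of this approach can be illustrated with a minimal sketch. The helper name `sample_then_truncate()` and the batching scheme are hypothetical, not the package's actual code; the sketch simply keeps drawing from `rbinom()` and discarding values outside $[a, b]$ until enough remain.

```r
# Hypothetical helper: sample from the untruncated binomial, then truncate.
sample_then_truncate <- function(n, size, prob, a, b) {
  kept <- integer(0)
  drawn <- 0L
  while (length(kept) < n) {
    y <- rbinom(n, size, prob)           # draw a full batch each round
    drawn <- drawn + length(y)
    kept <- c(kept, y[y >= a & y <= b])  # discard values outside [a, b]
  }
  list(sample = kept[seq_len(n)], drawn = drawn)
}

set.seed(1)
res <- sample_then_truncate(n = 1000, size = 10, prob = 0.5, a = 8, b = 10)
res$drawn / 1000            # untruncated draws spent per retained value
sum(dbinom(8:10, 10, 0.5))  # acceptance probability: 56 / 1024
```

With these settings only about 5.5% of the draws fall inside the truncation limits, so most of the work is thrown away, and the number of rounds needed is itself random.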

Solution

Sample directly from the truncated distribution:

$$ f_T(x; \theta) = \frac{f(x; \theta)}{[F(b) - F(a)]} $$

Use sample() to sample from $a, ..., b$ with weights $f_T(a, ..., b; \theta)$
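A minimal sketch of this idea for the binomial case, assuming inclusive integer limits `a` and `b`. The function name `rtrunc_direct_binom()` is hypothetical. Under that inclusive-limits assumption, the normalizing constant for a discrete variable is written as $F(b) - F(a - 1)$; `sample()` and `sample.int()` rescale their `prob` weights anyway, so the division is shown only to mirror the formula.

```r
# Hypothetical helper: draw n values directly from a binomial truncated to [a, b].
rtrunc_direct_binom <- function(n, size, prob, a = 0, b = size) {
  support <- a:b
  # f_T(x) = f(x) / P(a <= X <= b), evaluated on the truncated support only
  dens <- dbinom(support, size, prob)
  weights <- dens / (pbinom(b, size, prob) - pbinom(a - 1, size, prob))
  # Index into the support so that a single-point support (a == b) also works
  support[sample.int(length(support), n, replace = TRUE, prob = weights)]
}

set.seed(1)
y <- rtrunc_direct_binom(1000, size = 10, prob = 0.5, a = 8, b = 10)
table(y)
```

Every draw is retained, so the number of random values generated equals the target sample size. `sample.int()` is used instead of `sample()` because `sample(x, ...)` treats a single number `x` as `1:x`, which would misbehave when `a == b`.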

Implemented for the binomial; see the code there. Needs to be implemented for the other discrete distributions: Poisson, negative binomial, others?

[x] Binomial
[x] Negative binomial
[x] Poisson

Note: the weights $f_T(x; \theta)$ are already implemented as dtrunc.XXXX() functions.

wleoncio commented 2 years ago

This is the same as #72, isn't it? Also, are there any impediments to implementing this for continuous distributions as well?

rho62 commented 2 years ago

Perhaps... not sure... It seems to me that there is a coding issue (only calling rtrunc vs sampleFromUntruncated) and a content/solution issue: how do we actually do it?

/R

From: Waldir Leoncio. Reply-To: ocbe-uio/TruncExpFam. Date: Monday, 21 February 2022, 11:11. To: ocbe-uio/TruncExpFam. Cc: Rene Holst, Author. Subject: Re: [ocbe-uio/TruncExpFam] Rewrite functions for sampling from discrete truncated distributions (Issue #77)


wleoncio commented 2 years ago

On second thought, I think you're right. Seems wise to separate things and leave #72 for the duplicated coding issue and #77 and #78 for the slow-sampling issue.

wleoncio commented 1 year ago

If I understood correctly, the Binomial implementation is here:

https://github.com/ocbe-uio/TruncExpFam/blob/087ad1bbab0861a649d3851647db18094c1aad76/R/binomial.R#L15-L31

The f(x) / [F(b) - F(a)] part is clearly defined on L28. f(x) (i.e., dens) is transformed on L25 using my.dbinom(). If that is correct, then this idea could be replicated for rtrunc(), but unless I'm missing something there's no sampling involved in the function above, only rescaling of the densities (as expected, since resampling is only part of the r* functions).

So a DRY solution might involve the following steps:

[x] Extract the calculation of f_T(x) from the dtrunc methods into its own function. Could be a generic, since the x argument would have different rtrunc_ classes
[x] Use the extracted function from the previous step on a new version of rtrunc(), temporarily coexistent with the current implementation
[ ] Phase out the old function in favor of the new implementation
[ ] Adjust unit test expectations
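The first two steps might look roughly like the sketch below. All names (`truncated_density()`, the `parms_binomial` and `parms_poisson` classes, `rtrunc_direct()`) are hypothetical and chosen for illustration only; they are not the package's actual generics or classes. The sketch assumes inclusive integer limits and a finite upper limit, so that the support can be enumerated.

```r
# Hypothetical generic: f_T(x) on the integer support a:b, dispatched by family.
truncated_density <- function(parms, x, a, b) UseMethod("truncated_density")

truncated_density.parms_binomial <- function(parms, x, a, b) {
  norm <- pbinom(b, parms$size, parms$prob) - pbinom(a - 1, parms$size, parms$prob)
  ifelse(x < a | x > b, 0, dbinom(x, parms$size, parms$prob) / norm)
}

truncated_density.parms_poisson <- function(parms, x, a, b) {
  norm <- ppois(b, parms$lambda) - ppois(a - 1, parms$lambda)
  ifelse(x < a | x > b, 0, dpois(x, parms$lambda) / norm)
}

# Hypothetical sampler: the density method supplies the sampling weights.
rtrunc_direct <- function(n, parms, a, b) {
  stopifnot(is.finite(a), is.finite(b), a <= b)
  support <- a:b
  w <- truncated_density(parms, support, a, b)
  support[sample.int(length(support), n, replace = TRUE, prob = w)]
}

pois <- structure(list(lambda = 4), class = "parms_poisson")
set.seed(1)
y <- rtrunc_direct(500, pois, a = 2, b = 6)
table(y)
```

Because the density calculation lives in one place, both a density function and a sampler could call it, which is the DRY point of the list above. Distributions with unbounded support would need a different strategy (for example, a finite cut-off or inversion), which this sketch does not attempt.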

One thing that worries me about this approach is that the output of rtrunc() will probably no longer match its stats counterparts, since the untruncated distribution will no longer be the basis for the sampling. Is this acceptable?

wleoncio commented 1 year ago

An alternative to phasing out the old algorithm is to give rtrunc() an argument (e.g., a boolean legacy) that runs the old stats-compatible algorithm. This gives the user control over comparability with stats results vs speed of the new implementation.
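A minimal sketch of what such a switch could look like for the binomial case. The function name `rtrunc_binom_sketch()` and the argument name `legacy` are assumptions for illustration, not the package's interface.

```r
# Hypothetical wrapper: `legacy` selects between the two sampling strategies.
rtrunc_binom_sketch <- function(n, size, prob, a = 0, b = size, legacy = FALSE) {
  if (legacy) {
    # Old behaviour: draw from the untruncated binomial, discard, and top up
    y <- rbinom(n, size, prob)
    y <- y[y >= a & y <= b]
    while (length(y) < n) {
      extra <- rbinom(n, size, prob)
      y <- c(y, extra[extra >= a & extra <= b])
    }
    return(y[seq_len(n)])
  }
  # New behaviour: weighted draw from the truncated support
  support <- a:b
  w <- dbinom(support, size, prob)  # sample.int() rescales the weights
  support[sample.int(length(support), n, replace = TRUE, prob = w)]
}

# Without truncation, the legacy path reproduces stats::rbinom() for a given seed
set.seed(42); old <- rtrunc_binom_sketch(20, 10, 0.5, legacy = TRUE)
set.seed(42); ref <- rbinom(20, 10, 0.5)
identical(old, ref)
```

With the default limits nothing is discarded, so the legacy branch returns exactly the `rbinom()` draws. The new branch consumes the random number stream differently and generally will not match `rbinom()`, even for the same seed.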

ocbe-uio / TruncExpFam

Rewrite functions for sampling from discrete truncated distributions #77
