matherealize / simdata

An R package for simulating data
https://matherealize.github.io/simdata/

Truncation #7

Closed by AngelikaGeroldinger 3 years ago

AngelikaGeroldinger commented 3 years ago

If I understood correctly, the truncation applied by process_truncate uses different thresholds for each data set (based on the quartiles of that specific data set, etc.). Some time ago, Georg Heinze convinced me that the thresholds should not be data dependent but only scenario dependent (the data-generating process should not depend on the data already generated). So I usually determine the threshold from a large data set before simulating and then apply this fixed threshold in the simulation.

matherealize commented 3 years ago

Agreed. At the same time, truncation by statistics derived from a single dataset is something that may happen in practice and might be useful in a simulation study. Therefore I decided to keep the function, but added another one that provides truncation by fixed, user-defined thresholds. That way, truncation by global statistics derived from, e.g., a big dataset without truncation can be implemented. I also added a paragraph to the vignette.
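
To make the difference between the two approaches concrete, here is a minimal base R sketch. It does not use the simdata API. The helpers `generate`, `iqr_limits` and `truncate_to`, the log-normal distribution and the 1.5 x IQR rule are all assumptions chosen for illustration only.

```r
set.seed(1)

# Hypothetical data-generating process: a skewed (log-normal) variable.
generate <- function(n) rlnorm(n, meanlog = 0, sdlog = 1)

# Hypothetical helper: truncation limits from quartiles -/+ k * IQR.
iqr_limits <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  c(lower = q[1] - k * iqr, upper = q[2] + k * iqr)
}

# Hypothetical helper: clamp values to the interval [lower, upper].
truncate_to <- function(x, limits) {
  pmin(pmax(x, limits[["lower"]]), limits[["upper"]])
}

# Scenario-dependent limits: computed once from a large, untruncated
# reference sample and then reused for every simulated dataset.
fixed_limits <- iqr_limits(generate(1e6))

# Two simulated datasets from the same scenario.
x1 <- generate(100)
x2 <- generate(100)

# Data-dependent truncation: each dataset gets its own limits.
limits1 <- iqr_limits(x1)
limits2 <- iqr_limits(x2)
x1_data <- truncate_to(x1, limits1)
x2_data <- truncate_to(x2, limits2)

# Fixed truncation: both datasets share the same limits.
x1_fixed <- truncate_to(x1, fixed_limits)
x2_fixed <- truncate_to(x2, fixed_limits)

# Compare the per-dataset limits with the fixed ones.
print(rbind(dataset1 = limits1, dataset2 = limits2, fixed = fixed_limits))
```

With data-dependent limits, the effective data-generating process changes from dataset to dataset. With fixed limits it stays the same across all replications of a scenario.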

Implemented in 2dfedba0bfd23f923687b67cd010339c289a888d.