be able to generate correlated variables

peterhurford commented 1 year ago

One potential idea: https://stackoverflow.com/questions/18683821/generating-random-correlated-x-and-y-points-using-numpy

agucova commented 1 year ago

Rather than hardcoding correlation for specific distributions, we could apply the more general Cholesky transformation when sampling. See here. I can give this a try, but I'm not a statistician, so we might want feedback from one.
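
For reference, a minimal sketch of the Cholesky idea on standard normals, assuming NumPy. The function name `correlated_normals` and the 0.7 target are made up for illustration and are not part of squigglepy:

```python
import numpy as np


def correlated_normals(corr, n, seed=0):
    """Draw n rows of standard normals whose correlation approximates `corr`.

    Hypothetical helper for illustration only, not a squigglepy API.
    """
    rng = np.random.default_rng(seed)
    corr = np.asarray(corr, dtype=float)
    # Lower-triangular factor such that chol @ chol.T == corr.
    chol = np.linalg.cholesky(corr)
    # Independent standard normals, one column per variable.
    z = rng.standard_normal((n, corr.shape[0]))
    # Mixing the columns with the factor gives rows with covariance ~ corr.
    return z @ chol.T


target = np.array([[1.0, 0.7], [0.7, 1.0]])
samples = correlated_normals(target, 50_000)
observed = np.corrcoef(samples, rowvar=False)
```

This only works directly for normals; applying the same linear mix to other distributions would change their marginals, which is why a reordering or copula step is needed on top.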

agucova commented 1 year ago

@peterhurford what do you think?

peterhurford commented 1 year ago

@agucova Looks awesome!

peterhurford commented 1 year ago

https://github.com/rethinkpriorities/squigglepy/issues/45 may be related

agucova commented 1 year ago

@peterhurford I've talked with Jaime from Epoch, and they've told me they've been using copula-based methods as well. The way I've implemented the Iman-Conover method should be mathematically equivalent to the normal (Gaussian) copula method, but copulas can be more flexible in certain situations.
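
To make the reordering idea concrete, here is a simplified sketch in the spirit of Iman-Conover, assuming NumPy. It is not the squigglepy implementation: it skips the score-correction step of the full method, and the names `induce_rank_correlation` and `spearman` are hypothetical. It draws correlated normal scores, then reorders each existing sample column so its ranks match the ranks of the scores, which leaves the marginals untouched:

```python
import numpy as np


def induce_rank_correlation(samples, corr, seed=0):
    """Reorder each column of `samples` to follow the ranks of correlated normals.

    Simplified illustration only; the full Iman-Conover method also corrects
    for the sample correlation of the scores.
    """
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    n, k = samples.shape
    # Correlated normal scores that act as the rank template.
    scores = rng.standard_normal((n, k)) @ np.linalg.cholesky(corr).T
    out = np.empty_like(samples)
    for j in range(k):
        # Rank of each score within its column (0 = smallest).
        ranks = np.argsort(np.argsort(scores[:, j]))
        # Place the sorted sample values according to those ranks.
        out[:, j] = np.sort(samples[:, j])[ranks]
    return out


def spearman(x, y):
    """Spearman rank correlation for samples without ties."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]


rng = np.random.default_rng(1)
x = rng.lognormal(size=20_000)
y = rng.uniform(size=20_000)
target = np.array([[1.0, 0.8], [0.8, 1.0]])
reordered = induce_rank_correlation(np.column_stack([x, y]), target)
rank_corr = spearman(reordered[:, 0], reordered[:, 1])
```

Note that the 0.8 here is the correlation of the underlying normal scores; the resulting rank correlation of the reordered samples comes out slightly lower.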

In particular, I've noticed the IC method struggles with discrete distributions because the samples often contain too few distinct values for a reordering to induce the target correlation. I'm not sure if we can fix this with copulas, but it might be a good next step.
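
One way to see the limitation, as a rough sketch assuming NumPy (the probabilities 0.5 and 0.02 are arbitrary choices for illustration): for fixed samples, sorting both columns the same way gives the highest Pearson correlation any reordering can reach. With two Bernoulli samples that have very different probabilities, that ceiling is far below a target like 0.9:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
# Two binary samples with very different success probabilities (arbitrary values).
a = rng.binomial(1, 0.5, size=n)
b = rng.binomial(1, 0.02, size=n)

# Pairing the sorted columns is the best case for any reordering:
# every 1 in `b` is matched with a 1 in `a`.
max_corr = np.corrcoef(np.sort(a), np.sort(b))[0, 1]
```

No reordering of `a` and `b` can exceed `max_corr`, so a method that only permutes the samples cannot hit a higher target here.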

rethinkpriorities / squigglepy

be able to generate correlated variables #24