DeclareDesign / fabricatr

fabricatr: Imagine Your Data Before You Collect It
https://declaredesign.org/r/fabricatr

helpful error/warning about sigma matrix for link_levels #95

Closed graemeblair closed 6 years ago

graemeblair commented 6 years ago

If we can process the sigma matrix and identify cases where linking levels will not be possible (if it is not positive semi-definite?), then let's do that check once, when declare_population is run, rather than each time it simulates. So this may really be about handling in declare_population.

aaronrudkin commented 6 years ago

I had thought there was a set of restrictions other than PSD on a valid correlation matrix, but if it ends up just being PSD then this check will be easy. If anyone faster than me has a cite for this I'll steamroll ahead; but if not I'll do research ASAP and get this warning in.

nfultz commented 6 years ago

I think if people pass in Sigma, we should throw an error if low-rank chol() fails; if they pass in chol(Sigma), assume they know what they are doing.


aaronrudkin commented 6 years ago

I'm trying to do this and am not sure chol(Sigma) is what we want. Cholesky decomposition requires a positive definite matrix, but correlation matrices can be positive semi-definite (e.g. matrix(c(1, 0, 0, 0, 1, 1, 0, 1, 1), byrow=TRUE, ncol=3, nrow=3) is a valid correlation matrix).

chol(Sigma) on that matrix will fail, but the matrix should be fine as a correlation matrix; all(eigen(Sigma)$values >= 0) should be a reliable check for PSD matrices.

It does look like, based on some searching, that it's demonstrable that being PSD and having all entries between -1 and 1 is sufficient for something to be a correlation matrix, so I'll go ahead and implement that, yeah?
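
For reference, the example in this comment can be reproduced with a few lines of base R. This is only a sketch: the tolerance `tol` is an assumption added here to absorb floating-point rounding and is not part of the check proposed above.

```r
# The 3x3 example from the comment: variables 2 and 3 are perfectly correlated,
# so the matrix is singular (positive semi-definite, not positive definite).
Sigma <- matrix(c(1, 0, 0,
                  0, 1, 1,
                  0, 1, 1), byrow = TRUE, ncol = 3, nrow = 3)

# Unpivoted chol() requires positive definiteness and stops with an error here.
chol_result <- try(chol(Sigma), silent = TRUE)
chol_ok <- !inherits(chol_result, "try-error")

# symmetric = TRUE guarantees real eigenvalues; only.values skips the eigenvectors.
eigenvalues <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values

# Assumed tolerance: an eigenvalue that is zero in exact arithmetic can come
# back as a tiny negative number, so compare against -tol instead of 0.
tol <- 1e-8
is_psd <- all(eigenvalues >= -tol)
```

Here `chol_ok` is FALSE while `is_psd` is TRUE, which is the mismatch the comment describes.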

aaronrudkin commented 6 years ago

Current error stack for sigma, each with their own descriptive errors:

  1. Error if matrix is non-square or diagonal contains a number other than 1
  2. Error if matrix is not symmetric
  3. Error if any cells are outside [-1, 1]
  4. Error if matrix is not PSD

Just need to write tests.
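
The four checks listed above could be sketched roughly as follows. The function name `check_sigma`, its `tol` argument, and the error messages are hypothetical and chosen for illustration; this is not the code that went into fabricatr.

```r
# Hypothetical validator mirroring the error stack above.
check_sigma <- function(sigma, tol = 1e-8) {
  if (!is.matrix(sigma) || !is.numeric(sigma)) {
    stop("sigma must be a numeric matrix.")
  }

  # 1. Square, with ones on the diagonal.
  if (nrow(sigma) != ncol(sigma)) {
    stop("sigma must be a square matrix.")
  }
  if (any(abs(diag(sigma) - 1) > tol)) {
    stop("The diagonal of sigma must contain only 1s.")
  }

  # 2. Symmetric (unname() so that dimnames do not affect the comparison).
  if (!isSymmetric(unname(sigma), tol = tol)) {
    stop("sigma must be symmetric.")
  }

  # 3. Every cell within [-1, 1].
  if (any(abs(sigma) > 1 + tol)) {
    stop("All entries of sigma must be between -1 and 1.")
  }

  # 4. Positive semi-definite: no eigenvalue meaningfully below zero.
  #    symmetric = TRUE is safe because symmetry was checked in step 2.
  eigenvalues <- eigen(sigma, symmetric = TRUE, only.values = TRUE)$values
  if (any(eigenvalues < -tol)) {
    stop("sigma must be positive semi-definite.")
  }

  invisible(TRUE)
}
```

Running the symmetry check before the eigenvalue check means `eigen()` is only ever called on a symmetric matrix, so its values are real.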

aaronrudkin commented 6 years ago

LOL my error handling errors if you get complex eigenvalues, one sec.

Edit: Also forgot that correlation matrices need to be symmetric in addition to PSD and [-1, 1] bounded.
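
A small illustration of the complex-eigenvalue problem, assuming it arises from calling `eigen()` on a non-symmetric input (the matrix `m` below is made up for the example):

```r
# Non-symmetric matrix with unit diagonal and entries inside [-1, 1].
m <- matrix(c(1.0, -0.5,
              0.5,  1.0), byrow = TRUE, nrow = 2)

# For a non-symmetric input, eigen() can return complex eigenvalues.
ev <- eigen(m, only.values = TRUE)$values
has_complex <- is.complex(ev)

# Ordering comparisons are not defined for complex numbers in R, so a naive
# PSD check raises an error of its own instead of a descriptive message.
naive_check <- try(all(ev >= 0), silent = TRUE)
naive_check_failed <- inherits(naive_check, "try-error")

# Testing symmetry first lets the validator reject the matrix before eigen().
symmetric_first <- isSymmetric(m)
```

With this input `has_complex` and `naive_check_failed` are both TRUE and `symmetric_first` is FALSE, so a symmetry check placed ahead of the eigenvalue check catches the case cleanly.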

nfultz commented 6 years ago

Use chol(pivot=TRUE) and permute the random draws appropriately to deal with linear dependence.
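
One way to read this suggestion, sketched in base R. The variable names, the seed, and the sample size are illustrative assumptions, and `suppressWarnings()` is used because `chol(pivot = TRUE)` warns when the matrix is rank-deficient.

```r
set.seed(1)

# Same singular correlation matrix as in the earlier example.
Sigma <- matrix(c(1, 0, 0,
                  0, 1, 1,
                  0, 1, 1), byrow = TRUE, nrow = 3)

# Pivoted Cholesky works on positive semi-definite input.
R <- suppressWarnings(chol(Sigma, pivot = TRUE))
piv <- attr(R, "pivot")  # column order used by the factorization
r <- attr(R, "rank")     # numerical rank of Sigma

# Keep only the first r rows: crossprod(R_r) equals Sigma[piv, piv].
R_r <- R[seq_len(r), , drop = FALSE]

# Draw r independent standard normals per observation and map them through
# the factor. The resulting columns are in pivoted order.
n <- 1000
z <- matrix(rnorm(n * r), nrow = n)
x_pivoted <- z %*% R_r

# Undo the pivot so the columns line up with the rows/columns of Sigma.
x <- x_pivoted[, order(piv), drop = FALSE]
```

Because columns 2 and 3 of `Sigma` are perfectly correlated, columns 2 and 3 of `x` come out identical, and only `r` independent draws are needed per row.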
