wbonat / mcglm

Fitting multivariate covariance generalized linear models
GNU General Public License v3.0

Issues with data formatting for 91 DVs and 4 IVs for mcglm #19

Closed. MaggieDee closed this issue 3 years ago

MaggieDee commented 3 years ago

I have 91 proteins and I want to see how different independent variables drive their presence or abundance, and whether there is any correlation among them. I want to use a multivariate GLM, so I read your paper and have been trying to use mcglm. Is there a way to do this in one shot, without manually entering the formula for each of the 91 proteins? I tried doing it this way, but I get the following error for some proteins when I try to fit them as separate models. (Also, is there a way to export the output to a table in Excel?)

Automatic initial values selected.
Error in .local(x, ...) : internal_chm_factor: Cholesky factorization failed
Error: $ operator is invalid for atomic vectors

wbonat commented 3 years ago

Dear Maggie,

Unfortunately, I did not provide any smart way to specify such large models or to export output to Excel. I really doubt you will be able to fit such a large model without a very careful specification. You have to be careful with the sample size and the number of covariates for each response variable, and with the correct specification of the matrix linear predictor, link and variance functions. Note that the correlation matrix alone has 91*90/2 = 4095 parameters, plus 91 dispersion parameters, along with the regression coefficients. I have never fitted a model that large with the mcglm package. You will need several workarounds, including linking a more efficient linear algebra library into the R session, ideally one that supports parallel computing, and you will probably need a server with large memory and many cores.
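To make the size of the problem concrete, here is a rough sketch of the parameter count in Python. It is only arithmetic, not part of mcglm. The function name `count_parameters` is hypothetical, and the figure of 5 regression coefficients per response (an intercept plus one coefficient for each of the 4 independent variables) is an assumption; categorical covariates would need more.

```python
def count_parameters(n_responses, n_coefs_per_response, n_dispersion_per_response=1):
    """Count parameters in a multivariate model with one correlation per response pair.

    n_coefs_per_response is an assumption supplied by the caller
    (e.g. intercept + 4 numeric covariates = 5).
    """
    # One correlation parameter for every unordered pair of responses.
    correlation = n_responses * (n_responses - 1) // 2
    # Dispersion parameters, one per response by default.
    dispersion = n_responses * n_dispersion_per_response
    # Regression coefficients, assumed equal in number for every response.
    regression = n_responses * n_coefs_per_response
    return {
        "correlation": correlation,
        "dispersion": dispersion,
        "regression": regression,
        "total": correlation + dispersion + regression,
    }


counts = count_parameters(n_responses=91, n_coefs_per_response=5)
print(counts)
```

With 91 responses the correlation term dominates: 4095 of the parameters come from the correlation matrix, which is why the sample size matters so much here.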

As an initial strategy, I would fit a bivariate model for every pair of responses and think carefully about whether I really need to fit the full multivariate model. If you give more details, perhaps I can help you. The error that you got is probably related to the model specification. You can have a look at http://mcglm.leg.ufpr.br/ for a general explanation of how to specify the model's components.
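The pairwise strategy can be sketched as a simple enumeration. The Python below only lists the response pairs and builds R-style formula strings for each one; it does not call mcglm. The protein names (`protein_1` to `protein_91`), the covariate names (`iv1` to `iv4`) and the helper `pair_formulas` are all hypothetical placeholders for the real column names.

```python
from itertools import combinations

# Hypothetical column names; replace with the real ones from the data set.
proteins = [f"protein_{i}" for i in range(1, 92)]
covariates = ["iv1", "iv2", "iv3", "iv4"]


def pair_formulas(response_a, response_b, covariate_names):
    """Return the two R-style formula strings for one bivariate model."""
    rhs = " + ".join(covariate_names)
    return [f"{response_a} ~ {rhs}", f"{response_b} ~ {rhs}"]


# Every unordered pair of responses, each one a candidate bivariate model.
pairs = list(combinations(proteins, 2))
formulas = {pair: pair_formulas(pair[0], pair[1], covariates) for pair in pairs}

print(len(pairs))
print(formulas[("protein_1", "protein_2")])
```

Each string has the form that R's `as.formula()` accepts, so the same loop written in R would avoid typing the 91 formulas by hand. Each bivariate fit then carries a single correlation parameter instead of 4095.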

Good luck