arives / rr2

R2 for correlated data; see https://academic.oup.com/sysbio/advance-article/doi/10.1093/sysbio/syy060/5098616
GNU General Public License v3.0
18 stars 7 forks source link

GLS with group structure #15

Open jydu opened 4 years ago

jydu commented 4 years ago

Hi,

I noticed that R2.pred and R2.resid do not work when the corStruct object has group structure (when corClasses is called with a "~1|Group"-like formula). In that case, the corMatrix function returns a list of matrices instead of a single matrix, which leads to an error. I'm not sure what would be the best way to solve that... maybe to combine all matrices into a big one, where covariance is zero between group levels?

Thanks a lot for the great package and functions, it is very useful!

Julien.

arives commented 4 years ago

Julien,

Sorry for my slow response. Coincidentally, I also just realized this week that it doesn't work with weightings either.

You are right – the solution is just to combine everything into a large matrix. I need to fix up some things with rr2, so I'll try to update the code in the next week.

Cheers, Tony

From: "Julien Y. Dutheil" notifications@github.com Reply-To: arives/rr2 reply@reply.github.com Date: Thursday, June 4, 2020 at 8:01 AM To: arives/rr2 rr2@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [arives/rr2] GLS with group structure (#15)

Hi,

I noticed that R2.pred and R2.resid do not work when the corStruct object has group structure (when corClasses is called with a "~1|Group"-like formula). In that case, the corMatrix function returns a list of matrices instead of a single matrix, which leads to an error. I'm not sure what would be the best way to solve that... maybe to combine all matrices into a big one, where covariance is zero between group levels?

Thanks a lot for the great package and functions, it is very useful!

Julien.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/arives/rr2/issues/15, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACYX6LEXNI37S2PX63SOQH3RU6LIPANCNFSM4NSTIC7A.

jydu commented 4 years ago

Dear Tony,

Many thanks for your answer. By weighting, you mean in case of non-homogeneous variance (the 'weight' argument that takes a varFunc function as input) ? Those can also have a group structure indeed, making things quite complex. We are currently interested in applying the rr2 functions to a model with non-homogeneous variance (but no group structure) and a covariance matrix (with group structure). Do you confirm that the r2.lik function, at least, works in such cases? (as it is only based on the likelihood, it should, right?)

All the best,

Julien.

arives commented 4 years ago

Julien,

Yes, I should have said this about R2_lik. And yes, by weighing I just meant non-homogeneous.

R2_lik values are sometimes a little lower than R2_pred (see the SysBio paper). Nonetheless, R2_lik will be completely valid and "robust" to any model specification, provided (i) you can fit the model with ML and (ii) you make sure the reduced model is appropriate (e.g., you have the weightings in the reduce model if the reduced model has independent variables removed, etc.).

Cheers, Tony

From: "Julien Y. Dutheil" notifications@github.com Reply-To: arives/rr2 reply@reply.github.com Date: Sunday, June 7, 2020 at 1:46 PM To: arives/rr2 rr2@noreply.github.com Cc: "Anthony R. Ives" arives@wisc.edu, Comment comment@noreply.github.com Subject: Re: [arives/rr2] GLS with group structure (#15)

Dear Tony,

Many thanks for your answer. By weighting, you mean in case of non-homogeneous variance (the 'weight' argument that takes a varFunc function as input) ? Those can also have a group structure indeed, making things quite complex. We are currently interested in applying the rr2 functions to a model with non-homogeneous variance (but no group structure) and a covariance matrix (with group structure). Do you confirm that the r2.lik function, at least, works in such cases? (as it is only based on the likelihood, it should, right?)

All the best,

Julien.

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/arives/rr2/issues/15#issuecomment-640262277, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACYX6LHOMP3RNFGTCQ3UQJDRVPOBPANCNFSM4NSTIC7A.

jydu commented 4 years ago

Dear Tony,

This might be a naive question, but can we use R2.lik to assess how much "variance" is explained by the weighting, just like we ask how much is explained by the covariance of the residues? As we can consider a reduced model without phylogeny, we could also consider one without modelling of heteroscedasticity, or does it make no sense? (This is a bit of a rhetorical question as we are indeed mostly interested by the fixed effects of the model. But this could be interesting... in our case a model where the predicted variance is proportional to the mean fits the data much better.)

Many thanks for your insights,

Julien.

arives commented 4 years ago

Julien,

I just updated rr2, so please give it a spin and let me know if this addressses the Grouping factor issue in correlation for gls. I have now restricted R2_resid to only gls models with phylogenetic covariance structures, because R2_resid requires some convention for standardizing covariance matrices.

As for using R2_lik for comparing weightings, I think this is fine. Or maybe it is more direct just to compare the logLik. As for the reduced model without covariances, I think you could use a weighted or unweighted reduced model. In the examples for R2, I decided to use weights in the lm() reduced model.

Cheers, Tony

jydu commented 4 years ago

Dear Tony,

thanks a lot! The issue is solved in terms of the corMatrix function, but it still does not seem to work: Error in solve.default(VV) : Lapack routine dgesv: system is exactly singular: U[42,42] = 0 So it seems that the many zero entries in the compound covariance matrix are problematic. Maybe there is a reason why group factors are implemented as a list of matrix and not a big matrix, and that some optimization tricks exist that should also be used here? Because the model I tested can be fitted using GLS, I do not get singularity errors. I am not a specialist, but I suspect that in case of grouping, the likelihood might be computed for each group independently and then somehow combined, preventing to work with large sparse matrices. Maybe R2 should be computed in a similar way?

Best,

Julien.

arives commented 4 years ago

Julien,

Would you mind sending me enough code and data files to run your model? I think that's the only way I'm going to be able to work this out. I did run some test examples that I had made with grouping factors in the correlation structure, and rr2 handled them okay.

Cheers, Tony

From: "Julien Y. Dutheil" notifications@github.com Reply-To: arives/rr2 reply@reply.github.com Date: Sunday, June 14, 2020 at 10:11 AM To: arives/rr2 rr2@noreply.github.com Cc: "Anthony R. Ives" arives@wisc.edu, Comment comment@noreply.github.com Subject: Re: [arives/rr2] GLS with group structure (#15)

Dear Tony,

thanks a lot! The issue is solved in terms of the corMatrix function, but it still does not seem to work: Error in solve.default(VV) : Lapack routine dgesv: system is exactly singular: U[42,42] = 0 So it seems that the many zero entries in the compound covariance matrix are problematic. Maybe there is a reason why group factors are implemented as a list of matrix and not a big matrix, and that some optimization tricks exist that should also be used here? Because the model I tested can be fitted using GLS, I do not get singularity errors. I am not a specialist, but I suspect that in case of grouping, the likelihood might be computed for each group independently and then somehow combined, preventing to work with large sparse matrices. Maybe R2 should be computed in a similar way?

Best,

Julien.

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/arives/rr2/issues/15#issuecomment-643779827, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACYX6LBKSBDNUZFTXZE4A5TRWTSBRANCNFSM4NSTIC7A.

jydu commented 4 years ago

Done, hope this helps!

Julien.

nicolaycunha commented 1 year ago

Hi there,

I notice that R2 function is not working properly with non-phylogenetic GLS that has a autocorrelation structure.

My model is as follows:

model <- gls(y ~ poly(x1, 2) + poly(x2, 2),
  data = data,
  na.action = na.omit,
  correlation = corSpher(form = ~lat+lon, nugget = TRUE),
  method = "REML")

but when running the function I get the below error:

Models of class gls that are not phylogenetic do not have a R2_resid method.
Error in UseMethod("corMatrix") : 
  no applicable method for 'corMatrix' applied to an object of class "NULL"

Based on the open issue, I assume that my case is related.

Any help is much appreciated.

Cheers, Nicolay

daijiang commented 1 year ago

Hi @nicolaycunha , it would be helpful that you can provide a minimal reproducible example with data. Thanks.

arives commented 1 year ago

Nicolay,

The problem is that gls() doesn’t return the full covariance matrix of the random errors, so that we had to reconstruct the covariance matrix. We only did this for phylogenetic structures from ape. You can see the issue if you want by looking at

rr2:::R2_resid.gls.phylo

There is no mathematical reason R2_resid couldn’t work with any covariance structure, but we didn’t want to have to reconstruct every possible covariance matrix. If you really want to use R2_resid, you can write your own version equivalent to rr2:::R2_resid.gls.phylo. Or, more simply, you can just use the other two R2s.

Cheers, Tony

From: Daijiang Li @.> Date: Wednesday, June 7, 2023 at 5:48 PM To: arives/rr2 @.> Cc: Anthony R. Ives @.>, Comment @.> Subject: Re: [arives/rr2] GLS with group structure (#15)

Hi @nicolaycunhahttps://github.com/nicolaycunha , it would be helpful that you can provide a minimal reproducible example with data. Thanks.

— Reply to this email directly, view it on GitHubhttps://github.com/arives/rr2/issues/15#issuecomment-1581258717, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACYX6LBSTNFAIGV6U4OHTMLXKC5FJANCNFSM4NSTIC7A. You are receiving this because you commented.Message ID: @.***>

nicolaycunha commented 1 year ago

Hi guys, many thanks for your fast replies.

When I was preparing a MRE as Daijiang Li asked for, I noticed that the error I am getting is actually related to a computational problem given the size of my data frame (3000 lines), so incorporating the spatial autocorrelation in my model is causing the problem. I tried with a subset of my data and was able to get R2_lik and R2_pred, which are sufficient for me as Tony suggested.

I will try to find a way to run my models faster, as they are taking about 30 minutes to conclude. Please let me know if you have tip for this case.

Thanks once again for the package and fast reply.

Cheers, Nicolay