TiagoOlivoto / metan

Package for multi-environment trial analysis
https://tiagoolivoto.github.io/metan/
GNU General Public License v3.0

Error while running path analysis #15

Closed drschowdary closed 2 years ago

drschowdary commented 2 years ago

pcoeff <- path_coeff_mat(cor_mat, resp = TGW)

Error in solve.default(cor.x, cor.y) : system is computationally singular: reciprocal condition number = 8.72837e-18

TiagoOlivoto commented 2 years ago

Dear @drschowdary, thanks for reporting this issue. This probably occurs because the correlation matrix between predictors (cor_mat) is singular or nearly singular, i.e., not positive definite. Could you please provide a reproducible example? See how to use reprex to do that (https://reprex.tidyverse.org/).

Here, I show a similar error by forcing an exactly singular matrix (determinant = 0):

library(metan)
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
#> |=========================================================|
#> | Multi-Environment Trial Analysis (metan) v1.16.0        |
#> | Author: Tiago Olivoto                                   |
#> | Type 'citation('metan')' to know how to cite metan      |
#> | Type 'vignette('metan_start')' for a short tutorial     |
#> | Visit 'https://bit.ly/pkgmetan' for a complete tutorial |
#> |=========================================================|
mat <- 
  data_ge2 %>% 
  mutate(EH2 = EH) %>% # force a perfectly correlated predictor
  select_numeric_cols() %>% 
  corr_coef()

pcoeff <- path_coeff_mat(mat$cor, resp = EP)
#> Error in solve.default(cor.x, cor.y): Lapack routine dgesv: system is exactly singular: U[15,15] = 0

Created on 2021-12-27 by the reprex package (v2.0.1)
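
As a quick pre-check before running a path analysis, the determinant and the reciprocal condition number of the predictor correlation matrix can be inspected with base R. The sketch below is only an illustration: the 3 x 3 matrix is hypothetical (x3 duplicates x1) and is not the data from this issue.

```r
# Hypothetical predictor correlation matrix: x3 is a copy of x1,
# so rows/columns 1 and 3 are identical and the matrix is singular.
vars <- c("x1", "x2", "x3")
cor_x <- matrix(c(1.0, 0.5, 1.0,
                  0.5, 1.0, 0.5,
                  1.0, 0.5, 1.0),
                nrow = 3, byrow = TRUE,
                dimnames = list(vars, vars))

det_x <- det(cor_x)   # determinant; zero (up to rounding) for a singular matrix
rc <- rcond(cor_x)    # estimate of the reciprocal condition number

# solve() rejects a system whose reciprocal condition number falls
# below its default tolerance, .Machine$double.eps
flagged <- rc < .Machine$double.eps
```

If `flagged` is TRUE, a call to `solve()` on that matrix is expected to stop with a singularity error like the ones shown in this thread.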

TiagoOlivoto commented 2 years ago

Since this clearly is not a bug but an issue related to the data, I'm closing this issue. Feel free to re-open it if necessary.

drschowdary commented 2 years ago

Sir, but with the same correlation matrix the path analysis is done in GENES!

TiagoOlivoto commented 2 years ago

@drschowdary, path analysis is done in GENES probably because GENES uses a different method of matrix inversion, which returns an approximate inverse of a matrix even when it is singular (see an example in metan at https://tiagoolivoto.github.io/metan/reference/solve_svd.html). I strongly suggest you check the multicollinearity of the matrix of predictor traits (use metan::colindiag()), because your path coefficients might be highly biased.
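
The idea of an approximate inverse can be sketched in base R with an SVD-based Moore-Penrose pseudo-inverse. This is an illustration under assumptions, not the code used by GENES or by metan::solve_svd(); the matrix, the response correlations, and the helper `pinv()` are hypothetical.

```r
# Hypothetical predictor correlation matrix; x3 duplicates x1,
# so the matrix is exactly singular.
cor_x <- matrix(c(1.0, 0.5, 1.0,
                  0.5, 1.0, 0.5,
                  1.0, 0.5, 1.0),
                nrow = 3, byrow = TRUE)
# Hypothetical correlations of each predictor with the response
cor_y <- c(0.6, 0.4, 0.6)

# Base R's solve() stops with an error on a singular system;
# here the error message is captured instead of aborting.
direct <- tryCatch(solve(cor_x, cor_y),
                   error = function(e) conditionMessage(e))

# Pseudo-inverse via SVD: invert only the singular values that are
# clearly non-zero and drop the rest (hypothetical helper).
pinv <- function(m, tol = sqrt(.Machine$double.eps)) {
  s <- svd(m)
  keep <- s$d > tol * max(s$d)
  u <- s$u[, keep, drop = FALSE]
  v <- s$v[, keep, drop = FALSE]
  # t(u) / d divides row i of t(u) by the i-th retained singular value
  v %*% (t(u) / s$d[keep])
}

# Minimum-norm coefficients obtained despite the singularity
coef_pinv <- drop(pinv(cor_x) %*% cor_y)
```

The pseudo-inverse always returns an answer, which is why software that uses it does not stop with an error. It does not remove the collinearity itself: the duplicated predictors simply share the effect between them, so the coefficients still need to be interpreted with care.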

drschowdary commented 2 years ago

Yes, sir, there is severe multicollinearity in the data. How can I do path analysis under collinearity using metan?

TiagoOlivoto commented 2 years ago

@drschowdary, I just changed the method of matrix inversion so that a pseudo-inverse is computed even with a singular matrix. A warning will alert you about this problem. To get this new feature, install the development version of metan with

devtools::install_github("TiagoOlivoto/metan")

After installing the development version, you'll be able to run the following examples. There are two main ways of dealing with multicollinearity (see our paper, https://acsess.onlinelibrary.wiley.com/doi/10.2134/agronj2016.04.0196, for more info). The first (and the one I'd suggest) is removing the traits that are the main cause of the problem. The second is including a correction factor (k) in the diagonal of the correlation matrix.

library(metan)
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
#> |=========================================================|
#> | Multi-Environment Trial Analysis (metan) v1.16.0        |
#> | Author: Tiago Olivoto                                   |
#> | Type 'citation('metan')' to know how to cite metan      |
#> | Type 'vignette('metan_start')' for a short tutorial     |
#> | Visit 'https://bit.ly/pkgmetan' for a complete tutorial |
#> |=========================================================|

df_colinear <- 
  data_ge2 %>% 
  select(PH, EP, KW, NKE) %>% 
  mutate(PH2 = PH)

# now it runs with a warning
pc1 <- path_coeff(df_colinear, resp = KW)
#> Warning: System is computationally singular. Check the collinearity of
#> predictors with `metan::colindiag()`.
#> Severe multicollinearity. 
#> Condition Number: 50796137447998448
#> Consider using a correction factor with 'correction' argument.
#> Consider identifying collinear traits with `non_collinear_vars()`
plot(pc1)

# include a correction factor, say, 0.05
pc2 <- path_coeff(df_colinear, resp = KW, correction = 0.05)
#> Weak multicollinearity. 
#> Condition Number: 56.782
#> You will probably have path coefficients close to being unbiased.
plot(pc2)


# identifying collinear traits
diagnosis <- colindiag(df_colinear)
print(diagnosis)
#> Severe multicollinearity in the matrix! Pay attention on the variables listed bellow
#> CN = 9413786837639034
#> Matrix determinant: 0 
#> Largest correlation: PH x PH2 = 1 
#> Smallest correlation: EP x NKE = 0.233 
#> Number of VIFs > 10: 0 
#> Number of correlations with r >= |0.8|: 1 
#> Variables with largest weight in the last eigenvalues: 
#> PH2 > PH > KW > NKE > EP

# removing the collinear trait PH2
# option 1
pc3 <- 
  df_colinear %>% 
  remove_cols(PH2) %>% 
  path_coeff(resp = KW)
#> Weak multicollinearity. 
#> Condition Number: 6.119
#> You will probably have path coefficients close to being unbiased.
plot(pc3)

# option 2
pc4 <- path_coeff(df_colinear, resp = KW, pred = -PH2)
#> Weak multicollinearity. 
#> Condition Number: 6.119
#> You will probably have path coefficients close to being unbiased.
plot(pc4)

Created on 2021-12-30 by the reprex package (v2.0.1)
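
To see why a correction factor helps, the effect of adding k to the diagonal can be sketched in base R. This is a simplified illustration, not metan's internal code: the matrix and response correlations are hypothetical, and the condition number is assumed here to be the ratio of the largest to the smallest eigenvalue (in absolute value).

```r
# Hypothetical predictor correlation matrix with a duplicated trait
cor_x <- matrix(c(1.0, 0.5, 1.0,
                  0.5, 1.0, 0.5,
                  1.0, 0.5, 1.0),
                nrow = 3, byrow = TRUE)
# Hypothetical correlations of each predictor with the response
cor_y <- c(0.6, 0.4, 0.6)

# Condition number as largest / smallest eigenvalue (assumed definition)
cond_number <- function(m) {
  ev <- abs(eigen(m, symmetric = TRUE, only.values = TRUE)$values)
  max(ev) / min(ev)
}

k <- 0.05                                 # correction factor
cor_x_k <- cor_x + diag(k, nrow(cor_x))   # add k to every diagonal element

cn_before <- cond_number(cor_x)     # huge or infinite: matrix is singular
cn_after <- cond_number(cor_x_k)    # moderate after the correction

# The corrected system can now be solved directly
coef_k <- solve(cor_x_k, cor_y)
```

Adding k shifts every eigenvalue up by k, so the smallest one can no longer be zero. The price is that the coefficients are slightly shrunk, which is why removing the collinear trait is the option suggested first.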

drschowdary commented 2 years ago

Thank you, sir
