TiagoOlivoto / metan

Package for multi-environment trial analysis
https://tiagoolivoto.github.io/metan/
GNU General Public License v3.0
35 stars 17 forks source link

error when I run ge_stat() code of metan package #21

Closed nivelive closed 1 year ago

nivelive commented 1 year ago
setwd("D:/PhD/Ragi")
library(readxl)
library(metan)
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
#> |=========================================================|
#> | Multi-Environment Trial Analysis (metan) v1.17.0        |
#> | Author: Tiago Olivoto                                   |
#> | Type 'citation('metan')' to know how to cite metan      |
#> | Type 'vignette('metan_start')' for a short tutorial     |
#> | Visit 'https://bit.ly/pkgmetan' for a complete tutorial |
#> |=========================================================|
Metan_analysis <- read_excel("Metan_analysis.xlsx", 
                             col_types = c("text", "text", "text", 
                                           "numeric", "numeric", "numeric", 
                                           "numeric", "numeric", "numeric", 
                                           "numeric", "numeric", "numeric", 
                                           "numeric"))
View(Metan_analysis)
stab <- ge_stats(Metan_analysis, ENV, GEN, REP)
#> Error in rowSums((gamma.n)^2): 'x' must be an array of at least two dimensions

Created on 2023-02-01 with reprex v2.0.2

TiagoOlivoto commented 1 year ago

@nivelive please, provide a reproducible example so that I can reproduce the error on my side.

nivelive commented 1 year ago
setwd("D:/PhD/Ragi")
library(readxl)
library(metan)
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
#> |=========================================================|
#> | Multi-Environment Trial Analysis (metan) v1.17.0        |
#> | Author: Tiago Olivoto                                   |
#> | Type 'citation('metan')' to know how to cite metan      |
#> | Type 'vignette('metan_start')' for a short tutorial     |
#> | Visit 'https://bit.ly/pkgmetan' for a complete tutorial |
#> |=========================================================|
library(reprex)
Metan_analysis <- read_excel("Metan_analysis.xlsx", 
                             col_types = c("text", "text", "text", 
                                           "numeric", "numeric", "numeric", 
                                           "numeric", "numeric", "numeric", 
                                           "numeric", "numeric", "numeric", 
                                           "numeric"))
View(Metan_analysis)
Metan_analysis$ENV <- as.factor(Metan_analysis$ENV)
Metan_analysis$GEN <- as.factor(Metan_analysis$GEN)
Metan_analysis$REP<- as.factor(Metan_analysis$REP)
inspect(Metan_analysis)
#> # A tibble: 13 x 10
#>    Variable Class   Missing Levels Valid_n   Min Median    Max Outlier Text 
#>    <chr>    <chr>   <chr>   <chr>    <int> <dbl>  <dbl>  <dbl>   <dbl> <lgl>
#>  1 ENV      factor  No      3          252 NA     NA     NA         NA NA   
#>  2 GEN      factor  No      28         252 NA     NA     NA         NA NA   
#>  3 REP      factor  No      3          252 NA     NA     NA         NA NA   
#>  4 DF       numeric No      -          252 62     75     91          0 NA   
#>  5 FLBL     numeric No      -          252 24.2   39.0   56.3        2 NA   
#>  6 FLBW     numeric No      -          252  0.37   0.92   1.63       0 NA   
#>  7 PL       numeric No      -          252 17.1   29.9   41.9        1 NA   
#>  8 PH       numeric No      -          252 66.5   92.6  147          0 NA   
#>  9 FL       numeric No      -          252  3.07   5.57  10.7        1 NA   
#> 10 FW       numeric No      -          252  0.27   0.87   1.6        0 NA   
#> 11 EHL      numeric No      -          252  3.1    6.53  11.9        0 NA   
#> 12 FNME     numeric No      -          252  2.67   6.67  11.3        0 NA   
#> 13 Y        numeric No      -          252  0.25   1.22   2.61       0 NA
#> Warning: Possible outliers in variable(s) FLBL, PL, FL. Use 'find_outliers()'
#> for more details.
stab <- ge_stats(Metan_analysis, ENV, GEN, REP)
#> Error in rowSums((gamma.n)^2): 'x' must be an array of at least two dimensions

Created on 2023-02-03 with reprex v2.0.2

TiagoOlivoto commented 1 year ago

@nivelive In my side, with the example data data_ge, no error is observed. Without Metan_analysis I can't reproduce the error.

library(metan)
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
#> |=========================================================|
#> | Multi-Environment Trial Analysis (metan) v1.17.0        |
#> | Author: Tiago Olivoto                                   |
#> | Type 'citation('metan')' to know how to cite metan      |
#> | Type 'vignette('metan_start')' for a short tutorial     |
#> | Visit 'https://bit.ly/pkgmetan' for a complete tutorial |
#> |=========================================================|
args(ge_stats)
#> function (.data, env, gen, rep, resp, verbose = TRUE, prob = 0.05) 
#> NULL
# by default, all numeric variables are analyzed.
# So, the argument 'resp' is not mandatory
model <- ge_stats(data_ge, ENV, GEN, REP)
#> Evaluating trait GY |======================                      | 50% 00:00:09 Evaluating trait HM |============================================| 100% 00:00:20 

Created on 2023-02-08 with reprex v2.0.2

nivelive commented 1 year ago

I tried making it into a reproducible example using reprex() but I am not understanding why is it not working. can you please tell me what mistake I m doing so that I can send you the proper reproducible example. I have referred Jenifer Bryan video and as per her instructions I tried to reprex it.

nivelive commented 1 year ago
setwd("D:/PhD/Ragi")
library(metan)
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
#> |=========================================================|
#> | Multi-Environment Trial Analysis (metan) v1.17.0        |
#> | Author: Tiago Olivoto                                   |
#> | Type 'citation('metan')' to know how to cite metan      |
#> | Type 'vignette('metan_start')' for a short tutorial     |
#> | Visit 'https://bit.ly/pkgmetan' for a complete tutorial |
#> |=========================================================|
library(readxl)
Metan_analysis <- read_excel("Metan_analysis.xlsx", 
                             col_types = c("text", "text", "text", 
                "numeric", "numeric", "numeric", 
                                    "numeric", "numeric", "numeric", 
                                        "numeric", "numeric", "numeric", 
                                            "numeric"))
Metan_analysis$REP <- as.factor(Metan_analysis$REP)
Metan_analysis$GEN <- as.factor(Metan_analysis$GEN)
Metan_analysis$ENV <- as.factor(Metan_analysis$ENV)
inspect(Metan_analysis, plot = TRUE, threshold = 28)
#> # A tibble: 13 x 10
#>    Variable Class   Missing Levels Valid_n   Min Median    Max Outlier Text 
#>    <chr>    <chr>   <chr>   <chr>    <int> <dbl>  <dbl>  <dbl>   <dbl> <lgl>
#>  1 ENV      factor  No      3          252 NA     NA     NA         NA NA   
#>  2 GEN      factor  No      28         252 NA     NA     NA         NA NA   
#>  3 REP      factor  No      3          252 NA     NA     NA         NA NA   
#>  4 DF       numeric No      -          252 62     75     91          0 NA   
#>  5 FLBL     numeric No      -          252 24.2   39.0   56.3        2 NA   
#>  6 FLBW     numeric No      -          252  0.37   0.92   1.63       0 NA   
#>  7 PL       numeric No      -          252 17.1   29.9   41.9        1 NA   
#>  8 PH       numeric No      -          252 66.5   92.6  147          0 NA   
#>  9 FL       numeric No      -          252  3.07   5.57  10.7        1 NA   
#> 10 FW       numeric No      -          252  0.27   0.87   1.6        0 NA   
#> 11 EHL      numeric No      -          252  3.1    6.53  11.9        0 NA   
#> 12 FNME     numeric No      -          252  2.67   6.67  11.3        0 NA   
#> 13 Y        numeric No      -          252  0.25   1.22   2.61       0 NA
#> Warning: Possible outliers in variable(s) FLBL, PL, FL. Use 'find_outliers()'
#> for more details.

descstat <- desc_stat(Metan_analysis, stats = "all")
anova<- anova_ind(Metan_analysis, env= ENV, gen = GEN, rep = REP)
#> Evaluating trait DF |====                                        | 10% 00:00:00 Evaluating trait FLBL |========                                  | 20% 00:00:00 Evaluating trait FLBW |=============                             | 30% 00:00:00 Evaluating trait PL |==================                          | 40% 00:00:00 Evaluating trait PH |======================                      | 50% 00:00:00 Evaluating trait FL |==========================                  | 60% 00:00:00 Evaluating trait FW |===============================             | 70% 00:00:00 Evaluating trait EHL |==================================         | 80% 00:00:00 Evaluating trait FNME |======================================    | 90% 00:00:00 Evaluating trait Y |=============================================| 100% 00:00:00 
anova_joint <- anova_joint(Metan_analysis, env= ENV, gen = GEN, rep = REP)
#> variable DF 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df  Sum Sq Mean Sq F value    Pr(>F)
#>        ENV   2.00 11353.7 5676.86 1788.21 3.83e-111
#>   REP(ENV)   6.00    21.7    3.62    1.14  3.42e-01
#>        GEN  27.00  3069.6  113.69   35.81  8.47e-55
#>    GEN:ENV  54.00   753.8   13.96    4.40  1.59e-13
#>  Residuals 162.00   514.3    3.17      NA        NA
#>      CV(%)   2.33      NA      NA      NA        NA
#>  MSR+/MSR-  14.87      NA      NA      NA        NA
#>     OVmean  76.38      NA      NA      NA        NA
#> ---------------------------------------------------------------------------
#> 
#> variable FLBL 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.00   1114   557.1   38.14 2.67e-14
#>   REP(ENV)   6.00    612   101.9    6.98 1.29e-06
#>        GEN  27.00   2539    94.1    6.44 7.87e-15
#>    GEN:ENV  54.00    957    17.7    1.21 1.79e-01
#>  Residuals 162.00   2366    14.6      NA       NA
#>      CV(%)   9.90     NA      NA      NA       NA
#>  MSR+/MSR-   1.86     NA      NA      NA       NA
#>     OVmean  38.59     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable FLBW 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source      Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.000 10.066   5.033  360.46 2.24e-60
#>   REP(ENV)   6.000  0.622   0.104    7.43 4.88e-07
#>        GEN  27.000  5.356   0.198   14.21 2.84e-30
#>    GEN:ENV  54.000  2.914   0.054    3.86 1.71e-11
#>  Residuals 162.000  2.262   0.014      NA       NA
#>      CV(%)  12.771     NA      NA      NA       NA
#>  MSR+/MSR-   2.267     NA      NA      NA       NA
#>     OVmean   0.925     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable PL 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.00 2296.2 1148.12 194.134 9.64e-44
#>   REP(ENV)   6.00   21.1    3.52   0.595 7.34e-01
#>        GEN  27.00  722.3   26.75   4.524 6.31e-10
#>    GEN:ENV  54.00  674.8   12.50   2.113 1.71e-04
#>  Residuals 162.00  958.1    5.91      NA       NA
#>      CV(%)   8.25     NA      NA      NA       NA
#>  MSR+/MSR-   3.10     NA      NA      NA       NA
#>     OVmean  29.46     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable PH 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.00  60360 30180.0  508.06 1.60e-70
#>   REP(ENV)   6.00   1509   251.4    4.23 5.50e-04
#>        GEN  27.00   5466   202.4    3.41 7.05e-07
#>    GEN:ENV  54.00   3218    59.6    1.00 4.80e-01
#>  Residuals 162.00   9623    59.4      NA       NA
#>      CV(%)   7.87     NA      NA      NA       NA
#>  MSR+/MSR-   2.51     NA      NA      NA       NA
#>     OVmean  97.94     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable FL 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.00 343.54 171.770  500.85 4.34e-70
#>   REP(ENV)   6.00   6.07   1.012    2.95 9.23e-03
#>        GEN  27.00 198.95   7.368   21.48 1.58e-40
#>    GEN:ENV  54.00  37.81   0.700    2.04 3.23e-04
#>  Residuals 162.00  55.56   0.343      NA       NA
#>      CV(%)   9.96     NA      NA      NA       NA
#>  MSR+/MSR-   2.14     NA      NA      NA       NA
#>     OVmean   5.88     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable FW 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source      Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.000 10.207  5.1033  378.52 8.73e-62
#>   REP(ENV)   6.000  0.492  0.0820    6.08 9.11e-06
#>        GEN  27.000  2.399  0.0888    6.59 3.41e-15
#>    GEN:ENV  54.000  1.870  0.0346    2.57 2.63e-06
#>  Residuals 162.000  2.184  0.0135      NA       NA
#>      CV(%)  14.450     NA      NA      NA       NA
#>  MSR+/MSR-   3.611     NA      NA      NA       NA
#>     OVmean   0.804     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable EHL 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df  Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.00 802.870 401.435 845.198 1.92e-86
#>   REP(ENV)   6.00   0.742   0.124   0.261 9.54e-01
#>        GEN  27.00 262.922   9.738  20.502 2.65e-39
#>    GEN:ENV  54.00  77.340   1.432   3.015 4.06e-08
#>  Residuals 162.00  76.943   0.475      NA       NA
#>      CV(%)  10.12      NA      NA      NA       NA
#>  MSR+/MSR-   2.63      NA      NA      NA       NA
#>     OVmean   6.81      NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable FNME 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.00 416.49 208.244  353.63 7.93e-60
#>   REP(ENV)   6.00   6.23   1.039    1.76 1.10e-01
#>        GEN  27.00  62.95   2.331    3.96 2.14e-08
#>    GEN:ENV  54.00  68.45   1.268    2.15 1.20e-04
#>  Residuals 162.00  95.40   0.589      NA       NA
#>      CV(%)  11.61     NA      NA      NA       NA
#>  MSR+/MSR-   2.44     NA      NA      NA       NA
#>     OVmean   6.61     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> variable Y 
#> ---------------------------------------------------------------------------
#> Joint ANOVA table
#> ---------------------------------------------------------------------------
#>     Source     Df Sum Sq Mean Sq F value   Pr(>F)
#>        ENV   2.00  35.90 17.9503  315.21 1.43e-56
#>   REP(ENV)   6.00   1.90  0.3168    5.56 2.87e-05
#>        GEN  27.00  13.28  0.4919    8.64 7.29e-20
#>    GEN:ENV  54.00   7.14  0.1323    2.32 2.55e-05
#>  Residuals 162.00   9.23  0.0569      NA       NA
#>      CV(%)  18.60     NA      NA      NA       NA
#>  MSR+/MSR-   1.98     NA      NA      NA       NA
#>     OVmean   1.28     NA      NA      NA       NA
#> ---------------------------------------------------------------------------
#> 
#> ---------------------------------------------------------------------------
#> Variables with nonsignificant GxE interaction
#> FLBL PH 
#> ---------------------------------------------------------------------------
#> Done!
stab <- ge_stats(Metan_analysis, env= ENV, gen = GEN, rep = REP)
#> Error in rowSums((gamma.n)^2): 'x' must be an array of at least two dimensions

Created on 2023-02-08 with reprex v2.0.2

TiagoOlivoto commented 1 year ago

In mixed-effect models (e.g., WAASB), traits with p-values equal to 1 for the GEI effects can generate such an error. I suggest you try to run one trait at once and see which one generates the error. It seems that such an error is being caused by a trait with non-significant GEI effects (try removing FLBL and PH) and run again. Since I don't have the raw data to check this out, It is difficult to say the exact source of the error.

nivelive commented 1 year ago

as per your suggestion I analyzed each trait separately and I could get results for only two traits out of 10. Is this means other traits have non significant GXE interaction?

TiagoOlivoto commented 1 year ago

@nivelive If GXE interaction for one trait is 'highly' non-significant (p-value > 0.8, for example) it means that the response of genotypes is very likely to be not dependent on the environmental condition, but this must be interpreted with caution. For example, within 60-70 genotypes, a few (say, 2-3) may have different responses and interact with environments. So, it is always a good choice, to explore the dataset.

Since I don't have the raw data to find the source of the error and this seems to be not a clear bug, I'm closing this now. Feel free to re-open it if you find it necessary.