thibautjombart / adegenet

adegenet: a R package for the multivariate analysis of genetic markers

Strata problem #360

Open laiajulianamu opened 6 months ago

laiajulianamu commented 6 months ago

Hi, I am trying to set the strata for running an AMOVA. I am using a genlight object, but for some reason I get the error below. Can somebody help me, please?

> strata(gl) <- data.frame(other(gl))
Starting gl.read.dart 
Starting utils.read.dart 
  Topskip not provided.
 Setting topskip to 6 .
  Reading in the SNP data
  Detected 2 row format.
Number of rows per clone (should be only  2 s): 2 
 Added the following locus metrics:
AlleleID CloneID AlleleSequence TrimmedSequence SNP SnpPosition CallRate OneRatioRef OneRatioSnp FreqHomRef FreqHomSnp FreqHets PICRef PICSnp AvgPIC AvgCountRef AvgCountSnp RepAvg .
Recognised: 479 individuals and 76246  SNPs in a 2 row format using Report_DTu24-9051_2_moreOrders_SNP_1.csv 
Completed: utils.read.dart 
Starting utils.dart2genlight 
Starting conversion....
Format is 2 rows.
Please note conversion of bigger data sets will take some time!
Once finished, we recommend to save the object using save(object, file="object.rdata")
Adding individual metrics: meta2.csv .
  Ids for individual metadata (at least a subset of) are matching!
  Found  479 matching ids out of 479 ids provided in the ind.metadata file.
  Added population assignments.
  Added latlon data.
 Added  id  to the other$ind.metrics slot.
  Added  pop  to the other$ind.metrics slot.
  Added  lat  to the other$ind.metrics slot.
  Added  lon  to the other$ind.metrics slot.
Completed: utils.dart2genlight 
   479 rows and 76246 columns of data read
  Read depth calculated and added to the locus metrics
  Minor Allele Frequency (MAF) calculated and added to the locus metrics
  Recalculating locus metrics provided by DArT (optionally specified)
Starting gl.compliance.check 
  Processing genlight object with SNP data
  Checking coding of SNPs
    SNP data scored NA, 0, 1 or 2 confirmed
  Checking for population assignments
    Population assignments confirmed
  Checking locus metrics and flags
  Recalculating locus metrics
  Checking for monomorphic loci
    No monomorphic loci detected
  Checking for loci with all missing data
    No loci with all missing data detected
  Checking whether individual names are unique.
  Checking for individual metrics
    Individual metrics confirmed
  Spelling of coordinates checked and changed if necessary to 
            lat/lon
Completed: gl.compliance.check 
Completed: gl.read.dart 
Error in utils.check.datatype(x, verbose = verbose) : 
  Fatal Error: inappropriate object passed to function, found list expecting genlight or SNP or SilicoDArT
> strata(gi) <- data.frame(other(gi))
Starting gl.read.dart 
Starting utils.read.dart 
  Topskip not provided.
 Setting topskip to 6 .
  Reading in the SNP data
  Detected 2 row format.
Number of rows per clone (should be only  2 s): 2 
 Added the following locus metrics:
AlleleID CloneID AlleleSequence TrimmedSequence SNP SnpPosition CallRate OneRatioRef OneRatioSnp FreqHomRef FreqHomSnp FreqHets PICRef PICSnp AvgPIC AvgCountRef AvgCountSnp RepAvg .
Recognised: 479 individuals and 76246  SNPs in a 2 row format using Report_DTu24-9051_2_moreOrders_SNP_1.csv 
Completed: utils.read.dart 
Starting utils.dart2genlight 
Starting conversion....
Format is 2 rows.
Please note conversion of bigger data sets will take some time!
Once finished, we recommend to save the object using save(object, file="object.rdata")
Adding individual metrics: meta2.csv .
  Ids for individual metadata (at least a subset of) are matching!
  Found  479 matching ids out of 479 ids provided in the ind.metadata file.
  Added population assignments.
  Added latlon data.
 Added  id  to the other$ind.metrics slot.
  Added  pop  to the other$ind.metrics slot.
  Added  lat  to the other$ind.metrics slot.
  Added  lon  to the other$ind.metrics slot.
Completed: utils.dart2genlight 
   479 rows and 76246 columns of data read
  Read depth calculated and added to the locus metrics
  Minor Allele Frequency (MAF) calculated and added to the locus metrics
  Recalculating locus metrics provided by DArT (optionally specified)
Starting gl.compliance.check 
  Processing genlight object with SNP data
  Checking coding of SNPs
    SNP data scored NA, 0, 1 or 2 confirmed
  Checking for population assignments
    Population assignments confirmed
  Checking locus metrics and flags
  Recalculating locus metrics
  Checking for monomorphic loci
    No monomorphic loci detected
  Checking for loci with all missing data
    No loci with all missing data detected
  Checking whether individual names are unique.
  Checking for individual metrics
    Individual metrics confirmed
  Spelling of coordinates checked and changed if necessary to 
            lat/lon
Completed: gl.compliance.check 
Completed: gl.read.dart 
Error in utils.check.datatype(x, verbose = verbose) : 
  Fatal Error: inappropriate object passed to function, found list expecting genlight or SNP or SilicoDArT

zkamvar commented 6 months ago

I believe this might be a problem from the {dartR} package maintained by @green-striped-gecko (who might be able to suggest a better forum for this question), because neither the output shown nor the error message comes from {adegenet}.

It looks like the error is coming from a function in {dartR} called utils.check.datatype.

It would be helpful if you could provide any of the following:

  1. the code for how you created the gl object
  2. the output of sessionInfo()
  3. the output of class(gl)
  4. the output of class(other(gl))

laiajulianamu commented 6 months ago

Hi Zhian, thank you for your response. Below are the answers:

  1. genlight object

    gl <- gl.read.dart( filename="Report_DTu24-9051_2_moreOrders_SNP_1.csv", ind.metafile="meta2.csv")

  2. sessionInfo()

    R version 4.3.2 (2023-10-31 ucrt)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 11 x64 (build 22631)

    Matrix products: default

    locale:
    [1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
    [4] LC_NUMERIC=C LC_TIME=English_United States.utf8

    time zone: America/Chicago
    tzcode source: internal

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] poppr_2.9.6 pegas_1.3 ape_5.7-1 dartR.captive_0.90 dartR.spatial_0.89 dartR.popgen_0.88 dartR.sim_0.89
    [8] dartRverse_0.51 dartR.base_0.88 dartR.data_1.0.4 dplyr_1.1.4 ggplot2_3.5.0.9000 adegenet_2.1.10 ade4_1.7-22

    loaded via a namespace (and not attached):
    [1] DBI_1.2.2 gdsfmt_1.38.0 gridExtra_2.3 polysat_1.7-7 remotes_2.5.0 permute_0.9-7
    [7] rlang_1.1.3 magrittr_2.0.3 e1071_1.7-14 compiler_4.3.2 mgcv_1.9-0 maps_3.4.2
    [13] vctrs_0.6.5 reshape2_1.4.4 stringr_1.5.1 profvis_0.3.8 pkgconfig_2.0.3 crayon_1.5.2
    [19] SNPRelate_1.36.1 fastmap_1.1.1 ellipsis_0.3.2 utf8_1.2.4 promises_1.2.1 sessioninfo_1.2.2
    [25] purrr_1.0.2 cachem_1.0.8 seqinr_4.2-36 pak_0.7.2 later_1.3.2 terra_1.7-71
    [31] parallel_4.3.2 cluster_2.1.6 R6_2.5.1 stringi_1.8.3 boot_1.3-30 pkgload_1.3.4
    [37] gdistance_1.6.4 Rcpp_1.0.12 iterators_1.0.14 fields_15.2 usethis_2.2.3 httpuv_1.6.14
    [43] Matrix_1.6-1.1 splines_4.3.2 igraph_2.0.2 tidyselect_1.2.1 rstudioapi_0.16.0 vegan_2.6-4
    [49] doParallel_1.0.17 codetools_0.2-20 miniUI_0.1.1.1 pkgbuild_1.4.4 lattice_0.21-9 tibble_3.2.1
    [55] plyr_1.8.9 shiny_1.8.1.1 withr_3.0.0 sf_1.0-16 units_0.8-5 proxy_0.4-27
    [61] urlchecker_1.0.1 xml2_1.3.6 pillar_1.9.0 BiocManager_1.30.22 KernSmooth_2.23-22 foreach_1.5.2
    [67] generics_0.1.3 sp_2.1-3 hierfstat_0.5-11 munsell_0.5.1 scales_1.3.0 xtable_1.8-4
    [73] class_7.3-22 glue_1.7.0 tools_4.3.2 radiator_1.3.0 data.table_1.15.0 dotCall64_1.1-1
    [79] fs_1.6.3 melfuR_1.1 grid_4.3.2 tidyr_1.3.1 devtools_2.4.5 colorspace_2.1-0
    [85] nlme_3.1-163 patchwork_1.2.0 raster_3.6-26 StAMPP_1.6.3 LEA_3.14.0 cli_3.6.2
    [91] spam_2.10-0 fansi_1.0.6 viridisLite_0.4.2 ggdendro_0.2.0 gtable_0.3.4 digest_0.6.35
    [97] classInt_0.4-10 htmlwidgets_1.6.4 memoise_2.0.1 htmltools_0.5.7 lifecycle_1.0.4 dismo_1.3-14
    [103] mime_0.12 MASS_7.3-60

  3. class(gl)

    [1] "genlight"
    attr(,"package")
    [1] "adegenet"

  4. class(other(gl))

    [1] "list"

zkamvar commented 6 months ago

Thank you for providing the code, @laiajulianamu. Looking into the dartR.base::gl.read.dart() function, I think your problem is that the contents of the @other slot are not a data frame of population strata. That function calculates locus-based statistics, whereas strata() expects a data frame with one row per individual describing its population assignments.

However, digging a little further, it looks like a metadata file is required, and any column that couldn't be assigned to individual ID, population, or lat/lon is placed in @other$ind.metrics. If "meta2.csv" contains columns of strata, this might be your solution:

strata(gl) <- other(gl)$ind.metrics
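
To illustrate the idea without needing the DArT files, here is a minimal sketch on a made-up genlight object. It assumes only {adegenet} is loaded; the toy genotypes, the object name gl_toy, and the "region" column are hypothetical stand-ins for whatever your own metadata contains. It shows that other() returns a list, and that the per-individual data frame inside it can be checked and then assigned as strata:

```r
library(adegenet)

# Toy genlight object: 3 individuals x 4 SNPs (made-up genotypes)
gl_toy <- new("genlight", list(
  ind1 = c(0, 1, 2, 0),
  ind2 = c(1, 1, 0, 2),
  ind3 = c(2, 0, 1, 1)
))

# Stand-in for the per-individual table kept in @other$ind.metrics.
# The "region" column is hypothetical; use the strata columns that
# your own metadata file actually provides.
other(gl_toy)$ind.metrics <- data.frame(
  id     = indNames(gl_toy),
  pop    = c("A", "A", "B"),
  region = c("north", "north", "south")
)

# other() returns the whole list, which is not a strata table
class(other(gl_toy))

# Pull out the per-individual data frame and check it before assigning:
# strata need a data frame with one row per individual
meta <- other(gl_toy)$ind.metrics
stopifnot(is.data.frame(meta), nrow(meta) == nInd(gl_toy))

# Keep only the columns that describe the population hierarchy
strata(gl_toy) <- meta[, c("region", "pop")]
head(strata(gl_toy))
```

If the stopifnot() check fails on your real object, the row count or the type of other(gl)$ind.metrics is the first thing to inspect.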

@green-striped-gecko, please let me know if my assumptions are correct.

zkamvar commented 6 months ago

Checking in to see if the solution has worked for you, @laiajulianamu.

laiajulianamu commented 6 months ago

Hi Zhian,

No, I was not able to run it :(

zkamvar commented 6 months ago

@green-striped-gecko, since this originates from the dartR suite, would you be able to provide some insight?

@laiajulianamu Do you get the same error or a different error after you run the commands?

laiajulianamu commented 6 months ago

I am getting the same error. Thank you for asking. Best, Laia
