thierrygosselin / radiator

RADseq Data Exploration, Manipulation and Visualization using R
https://thierrygosselin.github.io/radiator/
GNU General Public License v3.0

SeqArray newest version: unused argument (.progress = TRUE) #159

Closed lindington closed 1 year ago

lindington commented 2 years ago

Hi Thierry,

I am trying to load a VCF file into radiator, ultimately to convert it into hzar input format. I think there is an issue with the argument .progress = TRUE in the SeqArray functions called in the radiator code. This argument seems to have been renamed to verbose in the newest SeqArray version.

From the SeqArray vignette: [screenshot of the "Arguments" section]

My error: When using tidy_vcf(data="file.vcf", parallel.core=1, output="file.hzar", verbose = TRUE) or genomic_converter("file.vcf",output = "file.hzar", verbose = TRUE)

I get the following error:


Data summary: 
    number of samples: 144
    number of markers: 9828

Filter monomorphic markers
Number of individuals / strata / chrom / locus / SNP:
    Blacklisted: 0 / 0 / 0 / 0 / 2434
Filter common markers: only 1 strata, returning data

Generating individual stats...
Generating markers stats...
Error in SeqArray::seqAlleleCount(gdsfile = gds, ref.allele = NULL, .progress = TRUE,  : 
  unused argument (.progress = TRUE)

Computation time, overall: 8 sec
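
The "unused argument" message above is what R raises when a call names an argument that the function no longer declares. A minimal base-R sketch of how one might check for this, assuming two hypothetical stand-in functions (`count_old` and `count_new` are illustrations, not the real SeqArray code):

```r
# Hypothetical stand-ins for an old and a new function signature;
# these are NOT the real SeqArray functions.
count_old <- function(x, .progress = FALSE) length(x)
count_new <- function(x, verbose = FALSE) length(x)

# formals() lists the arguments a function declares, so it can be
# used to test whether a given argument name is still accepted.
accepts_arg <- function(fun, arg) arg %in% names(formals(fun))

old_ok <- accepts_arg(count_old, ".progress")  # declared in the old signature
new_ok <- accepts_arg(count_new, ".progress")  # not declared in the new one

# Calling the new signature with the old argument name raises an error;
# tryCatch() captures its message instead of stopping the script.
msg <- tryCatch(
  count_new(1:3, .progress = TRUE),
  error = function(e) conditionMessage(e)
)
print(msg)
```

With SeqArray installed, the same helper could presumably be pointed at the real function, for example `accepts_arg(SeqArray::seqAlleleCount, ".progress")`, to see which argument name the installed version expects.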

My session info:


> devtools::session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.0 (2022-04-22 ucrt)
 os       Windows 10 x64 (build 19043)
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United Kingdom.utf8
 ctype    English_United Kingdom.utf8
 tz       Europe/Prague
 date     2022-04-26
 rstudio  2022.02.1+461 Prairie Trillium (desktop)
 pandoc   NA

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────
 package          * version  date (UTC) lib source
 abind              1.4-5    2016-07-21 [1] CRAN (R 4.2.0)
 ade4               1.7-19   2022-04-19 [1] CRAN (R 4.2.0)
 adegenet           2.1.5    2021-10-09 [1] CRAN (R 4.2.0)
 adegraphics        1.0-16   2021-09-16 [1] CRAN (R 4.2.0)
 adephylo           1.1-11   2017-12-18 [1] CRAN (R 4.2.0)
 adespatial         0.3-16   2022-03-31 [1] CRAN (R 4.2.0)
 ape                5.6-2    2022-03-02 [1] CRAN (R 4.2.0)
 assertthat         0.2.1    2019-03-21 [1] CRAN (R 4.2.0)
 backports          1.4.1    2021-12-13 [1] CRAN (R 4.2.0)
 BiocGenerics       0.41.2   2022-02-18 [1] Bioconductor
 BiocManager        1.30.17  2022-04-22 [1] CRAN (R 4.2.0)
 Biostrings         2.63.3   2022-03-29 [1] Bioconductor
 bit                4.0.4    2020-08-04 [1] CRAN (R 4.2.0)
 bit64              4.0.5    2020-08-30 [1] CRAN (R 4.2.0)
 bitops             1.0-7    2021-04-24 [1] CRAN (R 4.2.0)
 boot               1.3-28   2021-05-03 [2] CRAN (R 4.2.0)
 brio               1.1.3    2021-11-30 [1] CRAN (R 4.2.0)
 broom              0.8.0    2022-04-13 [1] CRAN (R 4.2.0)
 cachem             1.0.6    2021-08-19 [1] CRAN (R 4.2.0)
 callr              3.7.0    2021-04-20 [1] CRAN (R 4.2.0)
 car                3.0-12   2021-11-06 [1] CRAN (R 4.2.0)
 carData            3.0-5    2022-01-06 [1] CRAN (R 4.2.0)
 class              7.3-20   2022-01-16 [2] CRAN (R 4.2.0)
 classInt           0.4-3    2020-04-07 [1] CRAN (R 4.2.0)
 cli                3.3.0    2022-04-25 [1] CRAN (R 4.2.0)
 cluster            2.1.3    2022-03-28 [2] CRAN (R 4.2.0)
 coda             * 0.19-4   2020-09-30 [1] CRAN (R 4.2.0)
 codetools          0.2-18   2020-11-04 [2] CRAN (R 4.2.0)
 colorspace         2.0-3    2022-02-21 [1] CRAN (R 4.2.0)
 cowplot            1.1.1    2020-12-30 [1] CRAN (R 4.2.0)
 crayon             1.5.1    2022-03-26 [1] CRAN (R 4.2.0)
 data.table         1.14.2   2021-09-27 [1] CRAN (R 4.2.0)
 DBI                1.1.2    2021-12-20 [1] CRAN (R 4.2.0)
 deldir             1.0-6    2021-10-23 [1] CRAN (R 4.2.0)
 desc               1.4.1    2022-03-06 [1] CRAN (R 4.2.0)
 devtools           2.4.3    2021-11-30 [1] CRAN (R 4.2.0)
 digest             0.6.29   2021-12-01 [1] CRAN (R 4.2.0)
 dplyr              1.0.8    2022-02-08 [1] CRAN (R 4.2.0)
 e1071              1.7-9    2021-09-16 [1] CRAN (R 4.2.0)
 ellipsis           0.3.2    2021-04-29 [1] CRAN (R 4.2.0)
 fansi              1.0.3    2022-03-24 [1] CRAN (R 4.2.0)
 farver             2.1.0    2021-02-28 [1] CRAN (R 4.2.0)
 fastmap            1.1.0    2021-01-25 [1] CRAN (R 4.2.0)
 foreach          * 1.5.2    2022-02-02 [1] CRAN (R 4.2.0)
 fs                 1.5.2    2021-12-08 [1] CRAN (R 4.2.0)
 gdsfmt           * 1.31.2   2022-04-07 [1] Bioconductor
 generics           0.1.2    2022-01-31 [1] CRAN (R 4.2.0)
 GenomeInfoDb       1.31.10  2022-04-21 [1] Bioconductor
 GenomeInfoDbData   1.2.8    2022-04-26 [1] Bioconductor
 GenomicRanges      1.47.6   2022-02-18 [1] Bioconductor
 ggplot2            3.3.5    2021-06-25 [1] CRAN (R 4.2.0)
 ggpubr             0.4.0    2020-06-27 [1] CRAN (R 4.2.0)
 ggsignif           0.6.3    2021-09-09 [1] CRAN (R 4.2.0)
 glue               1.6.2    2022-02-24 [1] CRAN (R 4.2.0)
 gridExtra          2.3      2017-09-09 [1] CRAN (R 4.2.0)
 grur             * 0.1.4    2022-04-26 [1] Github (thierrygosselin/grur@d31c423)
 gtable             0.3.0    2019-03-25 [1] CRAN (R 4.2.0)
 hms                1.1.1    2021-09-26 [1] CRAN (R 4.2.0)
 htmltools          0.5.2    2021-08-25 [1] CRAN (R 4.2.0)
 httpuv             1.6.5    2022-01-05 [1] CRAN (R 4.2.0)
 httr               1.4.2    2020-07-20 [1] CRAN (R 4.2.0)
 hzar             * 0.2-5    2013-09-23 [1] CRAN (R 4.2.0)
 igraph             1.3.1    2022-04-20 [1] CRAN (R 4.2.0)
 installr         * 0.23.2   2021-05-08 [1] CRAN (R 4.2.0)
 IRanges            2.29.1   2022-02-18 [1] Bioconductor
 iterators          1.0.14   2022-02-05 [1] CRAN (R 4.2.0)
 jpeg               0.1-9    2021-07-24 [1] CRAN (R 4.2.0)
 KernSmooth         2.23-20  2021-05-03 [2] CRAN (R 4.2.0)
 labeling           0.4.2    2020-10-20 [1] CRAN (R 4.2.0)
 later              1.3.0    2021-08-18 [1] CRAN (R 4.2.0)
 lattice            0.20-45  2021-09-22 [2] CRAN (R 4.2.0)
 latticeExtra       0.6-29   2019-12-19 [1] CRAN (R 4.2.0)
 lazyeval           0.2.2    2019-03-15 [1] CRAN (R 4.2.0)
 lifecycle          1.0.1    2021-09-24 [1] CRAN (R 4.2.0)
 magrittr           2.0.3    2022-03-30 [1] CRAN (R 4.2.0)
 MASS             * 7.3-56   2022-03-23 [2] CRAN (R 4.2.0)
 Matrix             1.4-1    2022-03-23 [2] CRAN (R 4.2.0)
 MatrixModels       0.5-0    2021-03-02 [1] CRAN (R 4.2.0)
 mcmc               0.9-7    2020-03-21 [1] CRAN (R 4.2.0)
 MCMCpack         * 1.6-3    2022-04-13 [1] CRAN (R 4.2.0)
 memoise            2.0.1    2021-11-26 [1] CRAN (R 4.2.0)
 mgcv               1.8-40   2022-03-29 [2] CRAN (R 4.2.0)
 mime               0.12     2021-09-28 [1] CRAN (R 4.2.0)
 munsell            0.5.0    2018-06-12 [1] CRAN (R 4.2.0)
 nlme               3.1-157  2022-03-25 [2] CRAN (R 4.2.0)
 permute            0.9-7    2022-01-27 [1] CRAN (R 4.2.0)
 phylobase          0.8.10   2020-03-01 [1] CRAN (R 4.2.0)
 pillar             1.7.0    2022-02-01 [1] CRAN (R 4.2.0)
 pkgbuild           1.3.1    2021-12-20 [1] CRAN (R 4.2.0)
 pkgconfig          2.0.3    2019-09-22 [1] CRAN (R 4.2.0)
 pkgload            1.2.4    2021-11-30 [1] CRAN (R 4.2.0)
 plyr               1.8.7    2022-03-24 [1] CRAN (R 4.2.0)
 png                0.1-7    2013-12-03 [1] CRAN (R 4.2.0)
 prettyunits        1.1.1    2020-01-24 [1] CRAN (R 4.2.0)
 processx           3.5.3    2022-03-25 [1] CRAN (R 4.2.0)
 progress           1.2.2    2019-05-16 [1] CRAN (R 4.2.0)
 promises           1.2.0.1  2021-02-11 [1] CRAN (R 4.2.0)
 proxy              0.4-26   2021-06-07 [1] CRAN (R 4.2.0)
 ps                 1.7.0    2022-04-23 [1] CRAN (R 4.2.0)
 purrr              0.3.4    2020-04-17 [1] CRAN (R 4.2.0)
 quantreg           5.88     2022-02-05 [1] CRAN (R 4.2.0)
 R6                 2.5.1    2021-08-19 [1] CRAN (R 4.2.0)
 radiator         * 1.2.2    2022-04-26 [1] Github (thierrygosselin/radiator@6efdf14)
 raster             3.5-15   2022-01-22 [1] CRAN (R 4.2.0)
 RColorBrewer       1.1-3    2022-04-03 [1] CRAN (R 4.2.0)
 Rcpp               1.0.8.3  2022-03-17 [1] CRAN (R 4.2.0)
 RCurl              1.98-1.6 2022-02-08 [1] CRAN (R 4.2.0)
 readr              2.1.2    2022-01-30 [1] CRAN (R 4.2.0)
 remotes            2.4.2    2021-11-30 [1] CRAN (R 4.2.0)
 reshape2           1.4.4    2020-04-09 [1] CRAN (R 4.2.0)
 rlang              1.0.2    2022-03-04 [1] CRAN (R 4.2.0)
 rncl               0.8.6    2022-03-18 [1] CRAN (R 4.2.0)
 RNeXML             2.4.6    2022-02-09 [1] CRAN (R 4.2.0)
 rprojroot          2.0.3    2022-04-02 [1] CRAN (R 4.2.0)
 rstatix            0.7.0    2021-02-13 [1] CRAN (R 4.2.0)
 rstudioapi         0.13     2020-11-12 [1] CRAN (R 4.2.0)
 s2                 1.0.7    2021-09-28 [1] CRAN (R 4.2.0)
 S4Vectors          0.33.17  2022-04-06 [1] Bioconductor
 scales             1.2.0    2022-04-13 [1] CRAN (R 4.2.0)
 SeqArray         * 1.35.12  2022-04-19 [1] Bioconductor
 seqinr             4.2-8    2021-06-09 [1] CRAN (R 4.2.0)
 sessioninfo        1.2.2    2021-12-06 [1] CRAN (R 4.2.0)
 sf                 1.0-7    2022-03-07 [1] CRAN (R 4.2.0)
 shiny              1.7.1    2021-10-02 [1] CRAN (R 4.2.0)
 sp                 1.4-7    2022-04-20 [1] CRAN (R 4.2.0)
 SparseM            1.81     2021-02-18 [1] CRAN (R 4.2.0)
 spData             2.0.1    2021-10-14 [1] CRAN (R 4.2.0)
 spdep              1.2-4    2022-04-18 [1] CRAN (R 4.2.0)
 stringi            1.7.6    2021-11-29 [1] CRAN (R 4.2.0)
 stringr            1.4.0    2019-02-10 [1] CRAN (R 4.2.0)
 terra              1.5-21   2022-02-17 [1] CRAN (R 4.2.0)
 testthat           3.1.3    2022-03-29 [1] CRAN (R 4.2.0)
 tibble             3.1.6    2021-11-07 [1] CRAN (R 4.2.0)
 tidyr              1.2.0    2022-02-01 [1] CRAN (R 4.2.0)
 tidyselect         1.1.2    2022-02-21 [1] CRAN (R 4.2.0)
 tzdb               0.3.0    2022-03-28 [1] CRAN (R 4.2.0)
 units              0.8-0    2022-02-05 [1] CRAN (R 4.2.0)
 UpSetR             1.4.0    2019-05-22 [1] CRAN (R 4.2.0)
 usethis            2.1.5    2021-12-09 [1] CRAN (R 4.2.0)
 utf8               1.2.2    2021-07-24 [1] CRAN (R 4.2.0)
 uuid               1.1-0    2022-04-19 [1] CRAN (R 4.2.0)
 vctrs              0.4.1    2022-04-13 [1] CRAN (R 4.2.0)
 vegan              2.6-2    2022-04-17 [1] CRAN (R 4.2.0)
 vroom              1.5.7    2021-11-30 [1] CRAN (R 4.2.0)
 withr              2.5.0    2022-03-03 [1] CRAN (R 4.2.0)
 wk                 0.6.0    2022-01-03 [1] CRAN (R 4.2.0)
 XML                3.99-0.9 2022-02-24 [1] CRAN (R 4.2.0)
 xml2               1.3.3    2021-11-30 [1] CRAN (R 4.2.0)
 xtable             1.8-4    2019-04-21 [1] CRAN (R 4.2.0)
 XVector            0.35.0   2022-02-18 [1] Bioconductor
 zlibbioc           1.41.0   2022-01-13 [1] Bioconductor

 [1] C:/Users/Jag/AppData/Local/R/win-library/4.2
 [2] C:/Program Files/R/R-4.2.0/library

jessica-morrison commented 2 years ago

Hi,

I am also having this issue, is there a way to resolve it and continue the filtering?

Thanks, Jess

BernatBurriel commented 1 year ago

Hi,

I'm encountering the same problem. Have you been able to solve it?

Sapicq commented 1 year ago

Hello,

I am also having the same kind of issue. I am trying to use the function tidy_genomic_data in radiator to convert a VCF file to a tidy.vcf format. I had already successfully converted the same VCF file with this function in a previous version of radiator (v1.1.5).

Installation

Windows Server 2012 R2 Standard
R version 4.2.1
radiator_1.2.2
SeqArray_1.36.3

Code

library(radiator)
tidy.vcf <- tidy_genomic_data(data = "HapMap_MD_S80I25_MAF0.01_He0.6_hwe_SnpD_IndD_1SNP_LD_1156_Assigner_20200501.vcf", strata = "Pop_assigner_19g_updtae_20220915.txt")
################################################################################
######################### radiator::tidy_genomic_data ##########################
################################################################################
Execution date@time: 20220922@1858
Folder created: -8_radiator_tidy_genomic_20220922@1858
Function call and arguments stored in: radiator_tidy_genomic_data_args_20220922@1858.tsv
Analyzing strata file
Number of strata: 19
Number of individuals: 1156
Importing and tidying the VCF...
Execution date@time: 20220922@1858

Reading VCF...

Data summary: 
    number of samples: 1156
    number of markers: 2340
Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") : 
  The GDS node "$ref" does not exist.

Computation time, overall: 18 sec
############################## completed tidy_vcf ##############################

Computation time, overall: 18 sec
######################### completed tidy_genomic_data ##########################

Interestingly, when I used a subset of my dataset (only 116 SNPs with 1156 individuals), I got a different error message:

tidy.vcf <- tidy_genomic_data(data = "/Users/Admin/Documents/HapMap_1156_subset_Assigner_20220922.vcf", strata = "/Users/Admin/Documents/Pop_assigner_19g_updtae_20220915.txt")
################################################################################
######################### radiator::tidy_genomic_data ##########################
################################################################################
Execution date@time: 20220922@2008
Folder created: 00_radiator_tidy_genomic_20220922@2008
Function call and arguments stored in: radiator_tidy_genomic_data_args_20220922@2008.tsv
Analyzing strata file
Number of strata: 19
Number of individuals: 1156
Importing and tidying the VCF...
Execution date@time: 20220922@2008

Reading VCF...

Data summary: 
    number of samples: 1156
    number of markers: 116

Filter monomorphic markers
Number of individuals / strata / chrom / locus / SNP:
    Blacklisted: 0 / 0 / 0 / 0 / 0
Error in SeqArray::seqMissing(gdsfile = x, per.variant = TRUE, .progress = FALSE, : 
  unused argument (.progress = FALSE)
In addition: Warning message:
In as.POSIXlt.POSIXct(x, tz) :
  unable to identify current timezone 'C':
please set environment variable 'TZ'

Computation time, overall: 2 sec
############################## completed tidy_vcf ##############################

Computation time, overall: 2 sec
######################### completed tidy_genomic_data ##########################

Files: file.zip

Thanks for the help!

Sapicq commented 1 year ago

Hi all,

I have solved the problem with the function doCall from the R.utils package. It does not return an error if unused arguments are passed to the functions of the SeqArray package. Several changes to function arguments in the new version of SeqArray are not yet taken into account in the current version of the radiator package.

R.utils::doCall(genomic_converter("file.vcf",output = "file.hzar", verbose = TRUE))

The bad news is that I still get the following error:

Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") :
The GDS node "$ref" does not exist.
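
The idea of dropping unused arguments before a call can be sketched in base R. This is an illustration under stated assumptions, not the R.utils implementation: `call_with_supported_args`, `inner_new` and `outer_old` are hypothetical names invented for the example. It also shows that filtering applies only to the function being called directly; an outdated argument hard-coded inside that function's body is not affected.

```r
# Keep only the arguments that `fun` declares, then call it.
# If `fun` accepts `...`, everything is passed through unchanged.
call_with_supported_args <- function(fun, ...) {
  args <- list(...)
  declared <- names(formals(fun))
  if (!("..." %in% declared) && length(args) > 0) {
    arg_names <- names(args)
    if (is.null(arg_names)) arg_names <- rep("", length(args))
    # Unnamed (positional) arguments are always kept.
    keep <- arg_names == "" | arg_names %in% declared
    args <- args[keep]
  }
  do.call(fun, args)
}

# Hypothetical stand-ins, not real SeqArray or radiator code.
inner_new <- function(x, verbose = FALSE) sum(x)
outer_old <- function(x) inner_new(x, .progress = TRUE)  # stale argument inside

# Direct call: the stale argument is dropped before the call is made.
direct <- call_with_supported_args(inner_new, 1:3, .progress = TRUE)

# Wrapped call: the stale argument lives in the body of outer_old,
# so filtering the outer arguments cannot remove it.
wrapped <- tryCatch(
  call_with_supported_args(outer_old, 1:3),
  error = function(e) "still fails"
)
```

If the outdated argument is written inside the package's own source, as the error messages in this thread suggest, a wrapper around the top-level call may not be enough and the package code itself would need to change.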

zjons commented 1 year ago

Hi Thierry,

I am seeing what appears to be the same error when running radiator::filter_rad()

data <- filter_rad(data = "populations.snps.vcf", interactive.filter = TRUE, output = c("vcf", "plink"), filename = NULL, verbose = TRUE, parallel.core = parallel::detectCores() - 1)

Reading VCF...

Data summary: 
    number of samples: 141
    number of markers: 43678

Generating individual stats...
Generating markers stats...
Error in SeqArray::seqAlleleCount(gdsfile = gds, ref.allele = NULL, .progress = TRUE, : 
  unused argument (.progress = TRUE)

I tried to find a workaround (downgrading SeqArray, etc.), but that did not work.

I ended up forking (zjons/radiator) and changing ".progress = TRUE" to "verbose = TRUE" wherever it appeared in the calls to SeqArray:: that I could find (in gds.R, filter_mac.R and filter_common_markers.R). That seems to do the trick. At least I can read in the file now.
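
An alternative to hard-coding either argument name is to pick the name at run time from the function's declared arguments. The sketch below is only an illustration of that idea, not the change made in the fork or in radiator itself; `progress_arg`, `count_old` and `count_new` are hypothetical names.

```r
# Build the progress-related argument under whichever name `fun` declares.
# Returns an empty list when neither name is declared.
progress_arg <- function(fun, show = TRUE) {
  declared <- names(formals(fun))
  if (".progress" %in% declared) {
    list(.progress = show)
  } else if ("verbose" %in% declared) {
    list(verbose = show)
  } else {
    list()
  }
}

# Hypothetical stand-ins for the old and new signatures.
count_old <- function(x, .progress = FALSE) length(x)
count_new <- function(x, verbose = FALSE) length(x)

# The same calling code works against both signatures.
res_old <- do.call(count_old, c(list(1:4), progress_arg(count_old)))
res_new <- do.call(count_new, c(list(1:4), progress_arg(count_new)))
```

The trade-off is an extra layer of indirection at every call site, so a plain rename as described above is simpler when only one dependency version needs to be supported.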

thierrygosselin commented 1 year ago

Should work with v.1.2.3
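
A small sketch for checking whether an installed version is older than the one mentioned above. `needs_update` is a hypothetical helper written for this example, and the version strings passed to it are plain inputs, not output from a real installation.

```r
# TRUE when the installed version is older than the required one.
# package_version() compares version components numerically,
# so "1.2.10" is correctly treated as newer than "1.2.3".
needs_update <- function(installed, required = "1.2.3") {
  package_version(installed) < package_version(required)
}

older <- needs_update("1.2.2")
same  <- needs_update("1.2.3")
newer <- needs_update("1.2.10")
```

With radiator installed, the installed version could be supplied as `as.character(utils::packageVersion("radiator"))`.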