thierrygosselin / radiator

RADseq Data Exploration, Manipulation and Visualization using R
https://thierrygosselin.github.io/radiator/
GNU General Public License v3.0

genomic_converter errors #174

Closed jcaccavo closed 1 year ago

jcaccavo commented 1 year ago

I am getting errors when using the genomic_converter function to convert to genlight, bayescan, and pcadapt from a tidy data frame (tbl_df) produced from a .vcf file using filter_rad. I have the same problem when converting to these formats from the original .vcf file, and when using the write_genlight, write_bayescan, or write_pcadapt functions directly. I have also reviewed several GitHub issues related to the genomic_converter function (#103, #111, #145, #154, and #162), but could not find a solution there.

The following files can be downloaded to reproduce this error: the Rdata file (output of filter_rad run on the original .vcf file), the original .vcf file, and the strata file.

Below are the commands, errors, and session report. I am happy to provide more info as needed. Thanks in advance for your help.

dm_all_convert_genlight <- radiator::genomic_converter(dm_data_filtered, output = c("genlight"))

################################################################################
######################### radiator::genomic_converter ##########################
################################################################################
Execution date@time: 20230203@1121
Folder created: 06_radiator_genomic_converter_20230203@1121
Function call and arguments stored in: radiator_genomic_converter_args_20230203@1121.tsv
Filters parameters file generated: filters_parameters_20230203@1121.tsv
ℹ Importing data: tbl_df
Writing tidy data set:
radiator_data_20230203@1121.rad
✔ Importing data: tbl_df [3.1s]
Data is bi-allelic
✔ Preparing data [191ms]
Error in `dplyr::arrange()`:
! Problem with the implicit `transmute()` step.
✖ Problem while computing `..1 = POP_ID`.
Caused by error in `mask$eval_all_mutate()`:
! object 'POP_ID' not found
Run `rlang::last_error()` to see where the error occurred.

Computation time, overall: 5 sec
######################### completed genomic_converter ##########################
✖ Generating adegenet genlight object [1.2s]

dm_all_convert_bayescan <- radiator::genomic_converter(dm_data_filtered, output = c("bayescan"))

################################################################################
######################### radiator::genomic_converter ##########################
################################################################################
Execution date@time: 20230203@1125
Folder created: 07_radiator_genomic_converter_20230203@1125
Function call and arguments stored in: radiator_genomic_converter_args_20230203@1125.tsv
Filters parameters file generated: filters_parameters_20230203@1125.tsv
ℹ Importing data: tbl_df
Writing tidy data set:
radiator_data_20230203@1125.rad
✔ Importing data: tbl_df [3.1s]
Data is bi-allelic
✔ Preparing data [79ms]
Generating BayeScan file...
Error in radiator::write_bayescan(data = input, pop.select = pop.select,  : 
  object 'pop.select' not found

Computation time, overall: 3 sec
######################### completed genomic_converter ##########################
✖ Generating BayeScan [76ms]
> # Error in `dplyr::select()`:
> #   ! Can't subset columns that don't exist.
> # ✖ Column `POP_ID` doesn't exist.

dm_all_convert_pcadapt <- radiator::genomic_converter(dm_data_filtered, output = c("pcadapt"))

################################################################################
######################### radiator::genomic_converter ##########################
################################################################################
Execution date@time: 20230203@1126
Folder created: 08_radiator_genomic_converter_20230203@1126
Function call and arguments stored in: radiator_genomic_converter_args_20230203@1126.tsv
Filters parameters file generated: filters_parameters_20230203@1126.tsv
ℹ Importing data: tbl_df
Writing tidy data set:
radiator_data_20230203@1126.rad
✔ Importing data: tbl_df [3.3s]
Data is bi-allelic
✔ Preparing data [81ms]
Generating pcadapt file...
################################################################################
####################### radiator::filter_common_markers ########################
################################################################################
Execution date@time: 20230203@1126
Scanning for common markers...

Computation time, overall: 1 sec
####################### completed filter_common_markers ########################
################################################################################
######################### radiator::filter_monomorphic #########################
################################################################################
Execution date@time: 20230203@1126
Scanning for monomorphic markers...

Computation time, overall: 1 sec
######################### completed filter_monomorphic #########################
Error in `dplyr::select()`:
! Can't subset columns that don't exist.
✖ Column `POP_ID` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.
Warning message:
Unknown or uninitialised column: `POP_ID`. 

Computation time, overall: 5 sec
######################### completed genomic_converter ##########################
✖ Generating pcadapt file and object [1.2s]
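
The genlight and pcadapt failures, and the commented error under the BayeScan run, all point at a missing `POP_ID` column, so a quick look at the column names of the tidy data may help confirm this. A minimal sketch in base R: `missing_columns()` is a hypothetical helper, not part of radiator, and the toy data frame (including its `STRATA` column name) is only an assumption standing in for `dm_data_filtered`.

```r
# Hypothetical helper (not part of radiator): report which of the
# expected columns are absent from a data frame.
missing_columns <- function(data, expected) {
  setdiff(expected, colnames(data))
}

# Toy stand-in for the tidy data; the real column names may differ.
toy <- data.frame(
  MARKERS = c("m1", "m2"),
  INDIVIDUALS = c("ind1", "ind1"),
  STRATA = c("popA", "popA"),
  GT_BIN = c(0L, 1L)
)

# Lists the expected columns that are not present in `toy`.
missing_columns(toy, expected = c("POP_ID", "INDIVIDUALS"))
```

On the real object, `missing_columns(dm_data_filtered, "POP_ID")` returns an empty character vector if the column is present.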

devtools::session_info()

─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.2 (2022-10-31)
 os       macOS Monterey 12.6
 system   x86_64, darwin17.0
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Paris
 date     2023-02-03
 rstudio  2022.12.0+353 Elsbeth Geranium (desktop)
 pandoc   NA

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package          * version  date (UTC) lib source
 ade4             * 1.7-20   2022-11-01 [1] CRAN (R 4.2.0)
 adegenet         * 2.1.9    2023-01-12 [1] CRAN (R 4.2.0)
 ape                5.6-2    2022-03-02 [1] CRAN (R 4.2.0)
 assertthat         0.2.1    2019-03-21 [1] CRAN (R 4.2.0)
 BiocGenerics       0.44.0   2022-11-01 [1] Bioconductor
 BiocManager        1.30.19  2022-10-25 [1] CRAN (R 4.2.0)
 Biostrings         2.66.0   2022-11-01 [1] Bioconductor
 bit                4.0.5    2022-11-15 [1] CRAN (R 4.2.0)
 bit64              4.0.5    2020-08-30 [1] CRAN (R 4.2.0)
 bitops             1.0-7    2021-04-24 [1] CRAN (R 4.2.0)
 cachem             1.0.6    2021-08-19 [1] CRAN (R 4.2.0)
 callr              3.7.3    2022-11-02 [1] CRAN (R 4.2.0)
 carrier            0.1.0    2018-10-16 [1] CRAN (R 4.2.0)
 cli                3.6.0    2023-01-09 [1] CRAN (R 4.2.0)
 cluster            2.1.4    2022-08-22 [1] CRAN (R 4.2.2)
 colorspace         2.1-0    2023-01-23 [1] CRAN (R 4.2.0)
 crayon             1.5.2    2022-09-29 [1] CRAN (R 4.2.0)
 data.table         1.14.6   2022-11-16 [1] CRAN (R 4.2.0)
 DBI                1.1.3    2022-06-18 [1] CRAN (R 4.2.0)
 devtools           2.4.5    2022-10-11 [1] CRAN (R 4.2.0)
 digest             0.6.31   2022-12-11 [1] CRAN (R 4.2.0)
 dplyr              1.0.10   2022-09-01 [1] CRAN (R 4.2.0)
 ellipsis           0.3.2    2021-04-29 [1] CRAN (R 4.2.0)
 fansi              1.0.4    2023-01-22 [1] CRAN (R 4.2.0)
 farver             2.1.1    2022-07-06 [1] CRAN (R 4.2.0)
 fastmap            1.1.0    2021-01-25 [1] CRAN (R 4.2.0)
 fs                 1.6.0    2023-01-23 [1] CRAN (R 4.2.0)
 fst                0.9.8    2022-02-08 [1] CRAN (R 4.2.0)
 fstcore          * 0.9.14   2023-01-12 [1] CRAN (R 4.2.0)
 gdsfmt             1.34.0   2022-11-01 [1] Bioconductor
 generics           0.1.3    2022-07-05 [1] CRAN (R 4.2.0)
 GenomeInfoDb       1.34.7   2023-01-24 [1] Bioconductor
 GenomeInfoDbData   1.2.9    2023-01-19 [1] Bioconductor
 GenomicRanges      1.50.2   2022-12-16 [1] Bioconductor
 ggplot2            3.4.0    2022-11-04 [1] CRAN (R 4.2.0)
 glue               1.6.2    2022-02-24 [1] CRAN (R 4.2.0)
 gridExtra          2.3      2017-09-09 [1] CRAN (R 4.2.0)
 gtable             0.3.1    2022-09-01 [1] CRAN (R 4.2.0)
 hms                1.1.2    2022-08-19 [1] CRAN (R 4.2.0)
 htmltools          0.5.4    2022-12-07 [1] CRAN (R 4.2.0)
 htmlwidgets        1.6.1    2023-01-07 [1] CRAN (R 4.2.0)
 httpuv             1.6.8    2023-01-12 [1] CRAN (R 4.2.0)
 igraph             1.3.5    2022-09-22 [1] CRAN (R 4.2.0)
 IRanges            2.32.0   2022-11-01 [1] Bioconductor
 labeling           0.4.2    2020-10-20 [1] CRAN (R 4.2.0)
 later              1.3.0    2021-08-18 [1] CRAN (R 4.2.0)
 lattice            0.20-45  2021-09-22 [1] CRAN (R 4.2.2)
 lifecycle          1.0.3    2022-10-07 [1] CRAN (R 4.2.0)
 magrittr           2.0.3    2022-03-30 [1] CRAN (R 4.2.0)
 MASS               7.3-58.2 2023-01-23 [1] CRAN (R 4.2.0)
 Matrix             1.5-3    2022-11-11 [1] CRAN (R 4.2.0)
 matrixStats        0.63.0   2022-11-18 [1] CRAN (R 4.2.0)
 memoise            2.0.1    2021-11-26 [1] CRAN (R 4.2.0)
 mgcv               1.8-41   2022-10-21 [1] CRAN (R 4.2.2)
 mime               0.12     2021-09-28 [1] CRAN (R 4.2.0)
 miniUI             0.1.1.1  2018-05-18 [1] CRAN (R 4.2.0)
 munsell            0.5.0    2018-06-12 [1] CRAN (R 4.2.0)
 nlme               3.1-161  2022-12-15 [1] CRAN (R 4.2.0)
 permute            0.9-7    2022-01-27 [1] CRAN (R 4.2.0)
 pillar             1.8.1    2022-08-19 [1] CRAN (R 4.2.0)
 pkgbuild           1.4.0    2022-11-27 [1] CRAN (R 4.2.0)
 pkgconfig          2.0.3    2019-09-22 [1] CRAN (R 4.2.0)
 pkgload            1.3.2    2022-11-16 [1] CRAN (R 4.2.0)
 plyr               1.8.8    2022-11-11 [1] CRAN (R 4.2.0)
 prettyunits        1.1.1    2020-01-24 [1] CRAN (R 4.2.0)
 processx           3.8.0    2022-10-26 [1] CRAN (R 4.2.0)
 profvis            0.3.7    2020-11-02 [1] CRAN (R 4.2.0)
 promises           1.2.0.1  2021-02-11 [1] CRAN (R 4.2.0)
 ps                 1.7.2    2022-10-26 [1] CRAN (R 4.2.0)
 purrr              1.0.1    2023-01-10 [1] CRAN (R 4.2.0)
 R6                 2.5.1    2021-08-19 [1] CRAN (R 4.2.0)
 radiator         * 1.2.5    2023-01-26 [1] Github (thierrygosselin/radiator@430ef84)
 ragg               1.2.5    2023-01-12 [1] CRAN (R 4.2.0)
 Rcpp               1.0.10   2023-01-22 [1] CRAN (R 4.2.0)
 RCurl              1.98-1.9 2022-10-03 [1] CRAN (R 4.2.0)
 readr              2.1.3    2022-10-01 [1] CRAN (R 4.2.0)
 remotes            2.4.2    2021-11-30 [1] CRAN (R 4.2.0)
 reshape2           1.4.4    2020-04-09 [1] CRAN (R 4.2.0)
 rlang              1.0.6    2022-09-24 [1] CRAN (R 4.2.0)
 rstudioapi         0.14     2022-08-22 [1] CRAN (R 4.2.0)
 S4Vectors          0.36.1   2022-12-05 [1] Bioconductor
 scales             1.2.1    2022-08-20 [1] CRAN (R 4.2.0)
 SeqArray           1.38.0   2022-11-01 [1] Bioconductor
 seqinr             4.2-23   2022-11-28 [1] CRAN (R 4.2.0)
 sessioninfo        1.2.2    2021-12-06 [1] CRAN (R 4.2.0)
 shiny              1.7.4    2022-12-15 [1] CRAN (R 4.2.0)
 SNPRelate          1.32.2   2023-01-19 [1] Bioconductor
 stringi            1.7.12   2023-01-11 [1] CRAN (R 4.2.0)
 stringr            1.5.0    2022-12-02 [1] CRAN (R 4.2.0)
 systemfonts        1.0.4    2022-02-11 [1] CRAN (R 4.2.0)
 textshaping        0.3.6    2021-10-13 [1] CRAN (R 4.2.0)
 tibble             3.1.8    2022-07-22 [1] CRAN (R 4.2.0)
 tidyr              1.3.0    2023-01-24 [1] CRAN (R 4.2.2)
 tidyselect         1.2.0    2022-10-10 [1] CRAN (R 4.2.0)
 tzdb               0.3.0    2022-03-28 [1] CRAN (R 4.2.0)
 UpSetR             1.4.0    2019-05-22 [1] CRAN (R 4.2.0)
 urlchecker         1.0.1    2021-11-30 [1] CRAN (R 4.2.0)
 usethis            2.1.6    2022-05-25 [1] CRAN (R 4.2.0)
 utf8               1.2.2    2021-07-24 [1] CRAN (R 4.2.0)
 vctrs              0.5.2    2023-01-23 [1] CRAN (R 4.2.0)
 vegan              2.6-4    2022-10-11 [1] CRAN (R 4.2.0)
 vroom              1.6.1    2023-01-22 [1] CRAN (R 4.2.0)
 withr              2.5.0    2022-03-03 [1] CRAN (R 4.2.0)
 xtable             1.8-4    2019-04-21 [1] CRAN (R 4.2.0)
 XVector            0.38.0   2022-11-01 [1] Bioconductor
 zlibbioc           1.44.0   2022-11-01 [1] Bioconductor

 [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library

thierrygosselin commented 1 year ago

Will have a look at it tomorrow

thierrygosselin commented 1 year ago

Did you get all of these warnings from SeqArray the first time you read the VCF?

Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
  There are too many lines in the header (>= 10000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).

thierrygosselin commented 1 year ago

I see from the other post that you did, so ignore this. I removed all the lines in the VCF containing contig...

thierrygosselin commented 1 year ago

This should work with v1.2.6.
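
To check whether the installed copy is at least the version mentioned above, something along these lines may help. `needs_update()` is a hypothetical helper, and installing from GitHub via `remotes` is an assumption based on the session info in this thread, which shows radiator installed from `thierrygosselin/radiator`.

```r
# Hypothetical helper: TRUE if `pkg` is not installed or is older
# than `minimum`. requireNamespace() avoids an error when the
# package is absent.
needs_update <- function(pkg, minimum) {
  !requireNamespace(pkg, quietly = TRUE) ||
    utils::packageVersion(pkg) < minimum
}

if (needs_update("radiator", "1.2.6")) {
  message("radiator is older than 1.2.6 (or not installed); update with:")
  message('remotes::install_github("thierrygosselin/radiator")')
}
```

Restarting the R session after updating is usually a good idea so that the new version is the one that gets loaded.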

You can use the individual writing functions instead of calling radiator::genomic_converter:

?radiator::write_genlight
?radiator::write_pcadapt
?radiator::write_bayescan

data.gl <- radiator::write_genlight(data = dm_data_filtered)
data.pc <- radiator::write_pcadapt(data = dm_data_filtered)
data.bs <- radiator::write_bayescan(data = dm_data_filtered)

All at once:

test1 <- radiator::genomic_converter(data = dm_data_filtered, output = c("genlight", "pcadapt", "bayescan"))
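
If one format still fails, running the formats one at a time makes it easier to see which one is the problem. A minimal sketch, assuming nothing about radiator internals: `convert_each()` is a hypothetical helper that takes any converter function, so it can be tried without radiator installed.

```r
# Hypothetical helper: call `converter` once per output format and
# collect either the result or the error message for each format.
convert_each <- function(formats, converter) {
  results <- lapply(formats, function(fmt) {
    tryCatch(
      list(ok = TRUE, value = converter(fmt)),
      error = function(e) list(ok = FALSE, value = conditionMessage(e))
    )
  })
  stats::setNames(results, formats)
}

# With radiator installed, the converter would wrap the call shown above:
# res <- convert_each(
#   c("genlight", "pcadapt", "bayescan"),
#   function(fmt) radiator::genomic_converter(data = dm_data_filtered, output = fmt)
# )
# names(Filter(function(x) !x$ok, res))  # formats that failed
```

Each element of the result holds `ok` (whether the call succeeded) and `value` (the converted object, or the error message when it failed).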

thierrygosselin commented 1 year ago

Re-open the issue if you're still having problems.