thierrygosselin / radiator

RADseq Data Exploration, Manipulation and Visualization using R
https://thierrygosselin.github.io/radiator/
GNU General Public License v3.0
59 stars, 23 forks

Error when using genomic_converter genlight --> pcadapt #171

Closed GabryS3 closed 1 year ago

GabryS3 commented 1 year ago

Describe the bug: Hi, when I try to use "genomic_converter" to convert a genlight object to a pcadapt file, I get the following error:

################################################################################
######################### radiator::genomic_converter ##########################
################################################################################
Execution date@time: 20230112@1247
Folder created: -13330_radiator_genomic_converter_20230112@1247
Function call and arguments stored in: radiator_genomic_converter_args_20230112@1247.tsv
Filters parameters file generated: filters_parameters_20230112@1247.tsv

Importing data: genlight
Calibrating REF/ALT alleles...

Writing tidy data set: test_myGenlight_pcadapt.rad

Preparing data for output

Data is bi-allelic
Generating pcadapt file and object
Generating pcadapt file...
################################################################################
####################### radiator::filter_common_markers ########################
################################################################################
Execution date@time: 20230112@1247
Scanning for common markers...

Computation time, overall: 0 sec
####################### completed filter_common_markers ########################
################################################################################
######################### radiator::filter_monomorphic #########################
################################################################################
Execution date@time: 20230112@1247
Scanning for monomorphic markers...

Computation time, overall: 1 sec
######################### completed filter_monomorphic #########################
Error in dplyr::select():
! Can't subset columns that don't exist.
✖ Column POP_ID doesn't exist.
Run rlang::last_error() to see where the error occurred.
Warning messages:
1: In list.dirs(path = path.folder, full.names = FALSE) : over-long path
2: In list.dirs(path = path.folder, full.names = FALSE) : over-long path
3: Unknown or uninitialised column: POP_ID.

Computation time, overall: 15 sec
######################### completed genomic_converter ##########################
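The dplyr::select() error in the log reports that a POP_ID column was expected but was not present. The base R sketch below illustrates that kind of check on an invented table. The helper name missing_columns and the toy data are hypothetical and say nothing about radiator's internals.

```r
# Hypothetical helper: report which required columns a table lacks.
missing_columns <- function(data, required) {
  setdiff(required, names(data))
}

# Toy tidy-style table without population information (invented values).
tidy_toy <- data.frame(
  MARKERS = c("m1", "m1"),
  INDIVIDUALS = c("ind1", "ind2"),
  GT_BIN = c(0L, 1L),
  stringsAsFactors = FALSE
)

# POP_ID is the column named in the error message above.
absent <- missing_columns(tidy_toy, c("POP_ID", "INDIVIDUALS"))
if (length(absent) > 0) {
  message("Missing column(s): ", paste(absent, collapse = ", "))
}
```

Running a check like this on the tidy data before conversion would show whether population information made it into the table at all.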

To Reproduce Include the steps to reproduce the behavior:

I would really appreciate some help. I used genomic_converter before to obtain a stockR format from a genlight object, so I am not sure what the problem is now. Thanks a lot. Best, Gabriella

thierrygosselin commented 1 year ago

Dear Gabriella, It's difficult to reproduce the problem. Try reinstalling radiator with the latest version, 1.2.5 (it will be updated in a few hours). Re-run the command and re-open the issue if you are still having problems. Ideally, send the data by email. Best, Thierry
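A minimal sketch of how the installed version could be checked against the one suggested here. The helper name is_outdated is hypothetical. The install command is only printed, not run, and assumes the remotes package and the GitHub repository named at the top of this page.

```r
# Version suggested in the reply above.
required <- "1.2.5"

# Hypothetical helper: TRUE if the package is absent or older than `required`.
is_outdated <- function(pkg, required) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  utils::packageVersion(pkg) < package_version(required)
}

if (is_outdated("radiator", required)) {
  message("radiator is missing or older than ", required, ". One way to reinstall:")
  message('remotes::install_github("thierrygosselin/radiator")')
}
```

Restarting R after reinstalling avoids keeping an older copy of the package loaded in the session.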

GabryS3 commented 1 year ago

Hi Thierry, Thank you for your reply, and sorry for my very late response. I have been so busy at work that I could not even test reinstalling radiator. I will do it soon and let you know if I still have issues. Thank you. Best, Gabriella





GabryS3 commented 1 year ago

Hi Thierry, I finally managed to have genomic_converter() work after installing the new version of Radiator (1.2.8). So, thank you for the suggestion.

However, now I have 2 issues (while using the Radiator version 1.2.8):

  1. I want to understand how the tidy format is obtained (to understand if I get a correct conversion into tidy of my genlight object)
  2. I get an error when using detect_duplicate_genomes() function --> Error in value_vars(value.var, names(data)) : value.var values [n] are not found in 'data'.

1). Question 1 - tidy format: I just want to understand if the function "genomic_converter" is working properly now that I installed the most recent version of Radiator (1.2.8).

I used "genomic_converter" to convert my "genlight" dataset --> into a "tidy" dataset. Code:

    My_dataset_TIDY = genomic_converter(
      My_dataset, # class = genlight object
      strata = NULL,
      output = "tidy",
      filename = "My_dataset_TIDY",
      parallel.core = parallel::detectCores() - 1,
      verbose = TRUE)

However, I cannot understand if the conversion was correct or whether there are issues.

First of all, did I need to include a "STRATA" file? I did not include any. My genlight object was obtained through the package dartR.
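For context on the strata question, a strata table pairs each individual with a grouping label. The sketch below builds one in base R and writes it as a tab-separated file. The column names INDIVIDUALS and STRATA are an assumption taken from the radiator documentation and should be checked against the installed version. The sample names, populations, and the helper make_strata are invented.

```r
# Hypothetical helper: build a two-column strata table.
# Column names are assumed from the radiator documentation, not from this thread.
make_strata <- function(individuals, populations) {
  stopifnot(length(individuals) == length(populations))
  data.frame(
    INDIVIDUALS = individuals,
    STRATA = populations,
    stringsAsFactors = FALSE
  )
}

# Invented example values. With a genlight object `x`, adegenet::indNames(x)
# and adegenet::pop(x) would be the natural sources for these two vectors.
strata <- make_strata(
  individuals = c("ind1", "ind2", "ind3"),
  populations = c("popA", "popA", "popB")
)

# Write a tab-separated file that could be passed as a strata argument.
strata_file <- file.path(tempdir(), "strata.tsv")
write.table(strata, strata_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Whether a strata file is required for a genlight input is not answered in this thread. The sketch only shows what such a file could look like.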

Second, my concern is that the genotypes are not properly coded. For example:

TIDY format: individual 1, chrom1_locus 1002_42_A_T_42 --> REF = T, ALT = A, GT_BIN = 2
GENLIGHT format: individual 1, chrom1_locus 1002_42_A_T_42 --> genotype = 0 (= homozygous for REF allele)

TIDY format: individual 2, chrom1_locus 1002_42_A_T_42 --> REF = T, ALT = A, GT_BIN = 1
GENLIGHT format: individual 2, chrom1_locus 1002_42_A_T_42 --> genotype = 1 (heterozygous)

TIDY format: individual 3, chrom1_locus 1002_42_A_T_42 --> REF = T, ALT = A, GT_BIN = 0
GENLIGHT format: individual 3, chrom1_locus 1002_42_A_T_42 --> genotype = 2 (= homozygous for ALT allele = SNP)

What is GT_BIN and how is it coded? Is the genotype recoding shown above correct? From my understanding, GT_BIN seems to code the genotype in the opposite way to the genlight object. Is that correct?
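In the example above, the tidy REF/ALT pair (T/A) is the reverse of the order in the locus name (A_T). One possible reading, which is an assumption and not confirmed in this thread, is that GT_BIN counts copies of the tidy ALT allele and that REF and ALT were swapped during conversion. Under that assumption a genlight dosage g becomes 2 - g, which the sketch below checks on toy values copied from the three individuals above.

```r
# Toy values mirroring the three individuals in the example (not real output).
# Assumption: the genlight genotype counts copies of T (locus name A_T),
# while GT_BIN counts copies of A (tidy REF = T, ALT = A).
genlight_gt <- c(ind1 = 0L, ind2 = 1L, ind3 = 2L)
tidy_gt_bin <- c(ind1 = 2L, ind2 = 1L, ind3 = 0L)

# Swapping which allele is counted flips 0 and 2 and leaves 1 unchanged.
# Missing genotypes (NA) stay NA.
flip_dosage <- function(x) 2L - x

consistent <- identical(flip_dosage(genlight_gt), tidy_gt_bin)
consistent
```

If the same check holds across all markers whose REF/ALT differ between the two objects, the two codings carry the same genotypes and differ only in which allele is counted.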

2). Question 2: After converting my genlight object into tidy format with the code above (with genomic_converter()) --> I then tried to use "detect_duplicate_genomes" on my tidy dataset. However, I get the following error. Code:

    My_dataset_duplicate_genomes = detect_duplicate_genomes(
      data = "My_dataset_TIDY.rad",
      interactive.filter = TRUE,
      detect.duplicate.genomes = TRUE,
      dup.threshold = 0,
      distance.method = "manhattan",
      genome = FALSE,
      threshold.common.markers = NULL,
      blacklist.duplicates = FALSE,
      parallel.core = parallel::detectCores() - 1,
      verbose = TRUE)

################################################################################
###################### radiator::detect_duplicate_genomes ######################
################################################################################
Execution @.: @.
Folder created: @.
Function call and arguments stored in a file
File written: @*.**@*.>
File written: random.seed (247023)
Filters parameters file generated: @*.**@*.*>
Preparing data for analysis
Calculating manhattan distances between individuals...
Error in value_vars(value.var, names(data)) : value.var values [n] are not found in 'data'.
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Computation time, overall: 36 sec
###################### completed detect_duplicate_genomes ######################

What does this error "Error in value_vars(value.var, names(data)) : value.var values [n] are not found in 'data'." mean?
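Judging by its wording, this message looks like the one data.table::dcast() raises when the column named in value.var is absent from the table being reshaped. Here that column would be n. Whether radiator calls dcast() at this step is an assumption, not something confirmed in this thread. The sketch below triggers the same kind of error on an invented table and skips quietly if data.table is not installed.

```r
# Hypothetical helper: trigger the error on toy data and return its message.
reproduce_value_var_error <- function() {
  if (!requireNamespace("data.table", quietly = TRUE)) {
    return(NA_character_)  # data.table not available, nothing to show
  }
  # Invented table: it has a column `count` but no column `n`.
  dt <- data.table::data.table(
    ID1 = c("a", "a"),
    ID2 = c("x", "y"),
    count = 1:2
  )
  tryCatch(
    {
      data.table::dcast(dt, ID1 ~ ID2, value.var = "n")
      "no error"
    },
    error = function(e) conditionMessage(e)
  )
}

msg <- reproduce_value_var_error()
msg
```

Read this way, the error would mean that an intermediate table was expected to contain a count column named n and did not, which can happen when an upstream step returns a differently named column.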

I would really appreciate your help, as I have been trying to use this function for a while now and keep running into one issue or another along the way. Thanks a lot! Best, Gabriella

devtools session info: devtools::session_info()

─ Session info ─────────────────────────────────────────────────────────────────
setting value
version R version 4.2.1 (2022-06-23 ucrt)
os Windows 10 x64 (build 19044)
system x86_64, mingw32
ui RStudio
language (EN)
collate English_Australia.utf8
ctype English_Australia.utf8
tz Australia/Brisbane
date 2023-05-03
rstudio 2022.07.0+548 Spotted Wakerobin (desktop)
pandoc NA

─ Packages ─────────────────────────────────────────────────────────────────────
! package version date (UTC) lib source
ade4 1.7-19 2022-04-19 [1] CRAN (R 4.2.1)
adegenet 2.1.7 2022-06-06 [1] CRAN (R 4.2.1)
amap 0.8-19 2022-10-28 [1] CRAN (R 4.2.1)
ape 5.6-2 2022-03-02 [1] CRAN (R 4.2.1)
assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.1)
backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
BiocGenerics 0.42.0 2022-04-26 [1] Bioconductor
BiocManager 1.30.18 2022-05-18 [1] CRAN (R 4.2.1)
bit 4.0.4 2020-08-04 [1] CRAN (R 4.2.1)
bit64 4.0.5 2020-08-30 [1] CRAN (R 4.2.1)
broom 1.0.0 2022-07-01 [1] CRAN (R 4.2.1)
cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.1)
calibrate 1.7.7 2020-06-19 [1] CRAN (R 4.2.1)
callr 3.7.1 2022-07-13 [1] CRAN (R 4.2.1)
cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.1)
cli 3.4.1 2022-09-23 [1] CRAN (R 4.2.2)
cluster 2.1.3 2022-03-28 [2] CRAN (R 4.2.1)
codetools 0.2-18 2020-11-04 [2] CRAN (R 4.2.1)
colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.1)
combinat 0.0-8 2012-10-29 [1] CRAN (R 4.2.0)
crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.1)
VP dartR 2.9.4 2022-06-05 [?] CRAN (R 4.2.1) (on disk 2.0.4)
dartR.data 1.0.2 2022-11-16 [1] CRAN (R 4.2.2)
data.table 1.14.2 2021-09-27 [1] CRAN (R 4.2.1)
DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.1)
dbplyr 2.2.1 2022-06-27 [1] CRAN (R 4.2.1)
devtools 2.4.3 2021-11-30 [1] CRAN (R 4.2.1)
digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.1)
dismo 1.3-5 2021-10-11 [1] CRAN (R 4.2.1)
doParallel 1.0.17 2022-02-07 [1] CRAN (R 4.2.1)
dotCall64 1.0-1 2021-02-11 [1] CRAN (R 4.2.1)
dplyr 1.0.9 2022-04-28 [1] CRAN (R 4.2.1)
ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.1)
fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.1)
fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.1)
fields 14.0 2022-07-05 [1] CRAN (R 4.2.1)
forcats 0.5.1 2021-01-27 [1] CRAN (R 4.2.1)
foreach 1.5.2 2022-02-02 [1] CRAN (R 4.2.1)
fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.1)
fst 0.9.8 2022-02-08 [1] CRAN (R 4.2.1)
fstcore 0.9.12 2022-03-23 [1] CRAN (R 4.2.1)
gap 1.2.3-6 2022-05-13 [1] CRAN (R 4.2.1)
gap.datasets 0.0.5 2022-05-09 [1] CRAN (R 4.2.0)
gdata 2.18.0.1 2022-05-10 [1] CRAN (R 4.2.1)
gdistance 1.3-6 2020-06-29 [1] CRAN (R 4.2.1)
gdsfmt 1.32.0 2022-04-26 [1] Bioconductor
generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.1)
genetics 1.3.8.1.3 2021-03-01 [1] CRAN (R 4.2.1)
GGally 2.1.2 2021-06-21 [1] CRAN (R 4.2.1)
ggplot2 3.4.0 2022-11-04 [1] CRAN (R 4.2.2)
glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.1)
gridExtra 2.3 2017-09-09 [1] CRAN (R 4.2.1)
gtable 0.3.0 2019-03-25 [1] CRAN (R 4.2.1)
gtools 3.9.3 2022-07-11 [1] CRAN (R 4.2.1)
haven 2.5.0 2022-04-15 [1] CRAN (R 4.2.1)
hms 1.1.1 2021-09-26 [1] CRAN (R 4.2.1)
htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.1)
httpuv 1.6.5 2022-01-05 [1] CRAN (R 4.2.1)
httr 1.4.3 2022-05-04 [1] CRAN (R 4.2.1)
igraph 1.3.2 2022-06-13 [1] CRAN (R 4.2.1)
iterators 1.0.14 2022-02-05 [1] CRAN (R 4.2.1)
jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.2.1)
knitr 1.39 2022-04-26 [1] CRAN (R 4.2.1)
later 1.3.0 2021-08-18 [1] CRAN (R 4.2.1)
lattice 0.20-45 2021-09-22 [2] CRAN (R 4.2.1)
LEA 3.8.0 2022-04-26 [1] Bioconductor
lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.2)
lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.2.1)
magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.1)
maps 3.4.0 2021-09-25 [1] CRAN (R 4.2.1)
MASS 7.3-57 2022-04-22 [2] CRAN (R 4.2.1)
Matrix 1.4-1 2022-03-23 [2] CRAN (R 4.2.1)
memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.1)
mgcv 1.8-40 2022-03-29 [2] CRAN (R 4.2.1)
mime 0.12 2021-09-28 [1] CRAN (R 4.2.0)
mmod 1.3.3 2017-04-06 [1] CRAN (R 4.2.1)
modelr 0.1.8 2020-05-19 [1] CRAN (R 4.2.1)
munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.1)
mvtnorm 1.1-3 2021-10-08 [1] CRAN (R 4.2.0)
naniar 1.0.0 2023-02-02 [1] CRAN (R 4.2.3)
nlme 3.1-157 2022-03-25 [2] CRAN (R 4.2.1)
OutFLANK * 0.2 2022-07-18 [1] Github @.)
patchwork 1.1.1 2020-12-17 [1] CRAN (R 4.2.1)
pegas 1.1 2021-12-16 [1] CRAN (R 4.2.1)
permute 0.9-7 2022-01-27 [1] CRAN (R 4.2.1)
pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.1)
pinfsc50 1.2.0 2020-06-03 [1] CRAN (R 4.2.0)
pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.2.1)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.1)
pkgload 1.3.0 2022-06-27 [1] CRAN (R 4.2.1)
plotrix 3.8-2 2021-09-08 [1] CRAN (R 4.2.0)
plyr 1.8.7 2022-03-24 [1] CRAN (R 4.2.1)
png 0.1-7 2013-12-03 [1] CRAN (R 4.2.0)
PopGenReport 3.0.7 2022-05-27 [1] CRAN (R 4.2.1)
prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.1)
processx 3.7.0 2022-07-07 [1] CRAN (R 4.2.1)
promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.2.1)
ps 1.7.1 2022-06-18 [1] CRAN (R 4.2.1)
purrr 0.3.4 2020-04-17 [1] CRAN (R 4.2.1)
qvalue 2.28.0 2022-04-26 [1] Bioconductor
R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0)
R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0)
R.utils 2.12.0 2022-06-28 [1] CRAN (R 4.2.1)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.1)
VP radiator 1.2.8 2022-07-16 [?] Github @.) (on disk 1.2.2)
raster 3.5-21 2022-06-27 [1] CRAN (R 4.2.1)
RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.2.0)
Rcpp 1.0.9 2022-07-08 [1] CRAN (R 4.2.1)
readr 2.1.2 2022-01-30 [1] CRAN (R 4.2.1)
readxl 1.4.0 2022-03-28 [1] CRAN (R 4.2.1)
remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.1)
reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.1)
reshape 0.8.9 2022-04-12 [1] CRAN (R 4.2.1)
reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.2.1)
rgdal 1.5-32 2022-05-09 [1] CRAN (R 4.2.1)
RgoogleMaps 1.4.5.3 2020-02-12 [1] CRAN (R 4.2.1)
rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.2)
rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.1)
rvest 1.0.2 2021-10-16 [1] CRAN (R 4.2.1)
scales 1.2.0 2022-04-13 [1] CRAN (R 4.2.1)
seqinr 4.2-16 2022-05-19 [1] CRAN (R 4.2.1)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.1)
shiny 1.7.1 2021-10-02 [1] CRAN (R 4.2.1)
SNPRelate 1.30.1 2022-05-15 [1] Bioconductor
snpStats 1.46.0 2022-04-26 [1] Bioconductor
sp 1.5-0 2022-06-05 [1] CRAN (R 4.2.1)
spam 2.9-0 2022-07-11 [1] CRAN (R 4.2.1)
spida2 0.2.1 2023-04-26 [1] Github @.)
StAMPP 1.6.3 2021-08-08 [1] CRAN (R 4.2.1)
stockR 1.0.74 2020-03-04 [1] CRAN (R 4.2.1)
stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.1)
stringr 1.4.1 2022-08-20 [1] CRAN (R 4.2.1)
survival 3.3-1 2022-03-03 [2] CRAN (R 4.2.1)
terra 1.5-34 2022-06-09 [1] CRAN (R 4.2.1)
tibble 3.1.7 2022-05-03 [1] CRAN (R 4.2.1)
tidyr 1.2.0 2022-02-01 [1] CRAN (R 4.2.1)
tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.1)
tidyverse 1.3.1 2021-04-15 [1] CRAN (R 4.2.1)
tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.1)
usethis 2.1.6 2022-05-25 [1] CRAN (R 4.2.1)
utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.1)
vcfR 1.12.0 2020-09-01 [1] CRAN (R 4.2.1)
vctrs 0.5.1 2022-11-16 [1] CRAN (R 4.2.2)
vegan 2.6-2 2022-04-17 [1] CRAN (R 4.2.1)
versions 0.3 2016-09-01 [1] CRAN (R 4.2.0)
viridis 0.6.2 2021-10-13 [1] CRAN (R 4.2.1)
viridisLite 0.4.0 2021-04-13 [1] CRAN (R 4.2.1)
visdat 0.6.0 2023-02-02 [1] CRAN (R 4.2.3)
vroom 1.5.7 2021-11-30 [1] CRAN (R 4.2.1)
withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.1)
xfun 0.31 2022-05-10 [1] CRAN (R 4.2.1)
xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.1)
xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.1)
zlibbioc 1.42.0 2022-04-26 [1] Bioconductor

[1] C:/Users/scatag/AppData/Local/R/win-library/4.2
[2] C:/Program Files/R/R-4.2.1/library

V ── Loaded and on-disk version mismatch.
P ── Loaded and on-disk path mismatch.

──────────────────────────────────────
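The V and P flags above mark packages whose loaded version or path differs from what is installed on disk, here dartR and radiator. This typically happens when a package is reinstalled while an older copy is still loaded, so restarting R before re-running the analysis is a reasonable first step. A minimal base R sketch of the version comparison follows. The helper name version_mismatch is hypothetical and is not how devtools computes its flags.

```r
# Hypothetical helper: compare the version of a loaded namespace with the
# version recorded in the installed package's metadata on disk.
# Returns NA if the namespace is not loaded in this session.
version_mismatch <- function(pkg) {
  if (!isNamespaceLoaded(pkg)) {
    return(NA)
  }
  loaded <- as.character(getNamespaceVersion(pkg))
  on_disk <- as.character(utils::packageVersion(pkg))
  loaded != on_disk
}

# Example with a base package; replace "stats" with "radiator" or "dartR"
# in a session where those packages are loaded.
loadNamespace("stats")
version_mismatch("stats")
```

A TRUE result for radiator would suggest the session is still running code from before the reinstall.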

