thierrygosselin / radiator

RADseq Data Exploration, Manipulation and Visualization using R
https://thierrygosselin.github.io/radiator/
GNU General Public License v3.0

The GDS node "$ref" does not exist. #168

Closed: ymatmt closed this issue 1 year ago

ymatmt commented 1 year ago

Thank you for the great software. I got the error `Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") : The GDS node "$ref" does not exist.`, as shown below. How can I fix it?

command

p2022 <- genomic_converter(data = "plink.bed", filename = "p2022", output = "genepop", parallel.core = 11)

output

################################################################################
######################### radiator::genomic_converter ##########################
################################################################################
Execution date@time: 20221107@1909
Folder created: 03_radiator_genomic_converter_20221107@1909
Function call and arguments stored in: radiator_genomic_converter_args_20221107@1909.tsv
Filters parameters file generated: filters_parameters_20221107@1909.tsv

Importing data

Reading PLINK bed file...

Data summary:
number of samples: 42
number of markers: 123155
Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") :
  The GDS node "$ref" does not exist.

Computation time, overall: 2 sec

Computation time, overall: 2 sec
######################### completed genomic_converter ##########################

Session info
setting value
version R version 4.2.2 (2022-10-31)
os Ubuntu 20.04.5 LTS
system x86_64, linux-gnu
ui X11
language en_US
collate en_US.UTF-8
ctype en_US.UTF-8
tz Asia/Tokyo
date 2022-11-08
pandoc NA

Packages
package version date (UTC) lib source
ade4 1.7-20 2022-11-01 [1] CRAN (R 4.2.2)
adegenet 2.1.8 2022-10-02 [2] CRAN (R 4.2.2)
ape 5.6-2 2022-03-02 [2] CRAN (R 4.2.2)
assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.2.2)
backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.2)
Biobase 2.58.0 2022-11-01 [1] Bioconductor
BiocGenerics 0.44.0 2022-11-01 [1] Bioconductor
Biostrings 2.66.0 2022-11-01 [1] Bioconductor
bit 4.0.4 2020-08-04 [2] CRAN (R 4.2.2)
bit64 4.0.5 2020-08-30 [2] CRAN (R 4.2.2)
bitops 1.0-7 2021-04-24 [1] CRAN (R 4.2.2)
broom 1.0.1 2022-08-29 [2] CRAN (R 4.2.2)
cachem 1.0.6 2021-08-19 [2] CRAN (R 4.2.2)
callr 3.7.3 2022-11-02 [2] CRAN (R 4.2.2)
cli 3.4.1 2022-09-23 [1] CRAN (R 4.2.2)
cluster 2.1.4 2022-08-22 [4] CRAN (R 4.2.1)
colorspace 2.0-3 2022-02-21 [2] CRAN (R 4.2.2)
crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.2)
data.table 1.14.4 2022-10-17 [1] CRAN (R 4.2.2)
DBI 1.1.3 2022-06-18 [1] CRAN (R 4.2.2)
devtools 2.4.5 2022-10-11 [1] CRAN (R 4.2.2)
digest 0.6.30 2022-10-18 [1] CRAN (R 4.2.2)
dplyr 1.0.10 2022-09-01 [2] CRAN (R 4.2.2)
ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.2.2)
fansi 1.0.3 2022-03-24 [2] CRAN (R 4.2.2)
fastmap 1.1.0 2021-01-25 [2] CRAN (R 4.2.2)
formula.tools 1.7.1 2018-03-01 [1] CRAN (R 4.2.2)
fs 1.5.2 2021-12-08 [2] CRAN (R 4.2.2)
fst 0.9.8 2022-02-08 [2] CRAN (R 4.2.2)
fstcore 0.9.12 2022-03-23 [2] CRAN (R 4.2.2)
gdsfmt 1.34.0 2022-11-01 [2] Bioconductor
generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.2)
GenomeInfoDb 1.34.1 2022-11-03 [1] Bioconductor
GenomeInfoDbData 1.2.9 2022-11-07 [1] Bioconductor
GenomicRanges 1.50.0 2022-11-01 [1] Bioconductor
ggplot2 3.4.0 2022-11-04 [2] CRAN (R 4.2.2)
glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.2)
gtable 0.3.1 2022-09-01 [1] CRAN (R 4.2.2)
GWASExactHW 1.01 2013-01-05 [1] CRAN (R 4.2.2)
hms 1.1.2 2022-08-19 [2] CRAN (R 4.2.2)
htmltools 0.5.3 2022-07-18 [2] CRAN (R 4.2.2)
htmlwidgets 1.5.4 2021-09-08 [2] CRAN (R 4.2.2)
httpuv 1.6.6 2022-09-08 [2] CRAN (R 4.2.2)
igraph 1.3.5.9038 2022-11-07 [2] Github (igraph/rigraph@b765cf6)
IRanges 2.32.0 2022-11-01 [1] Bioconductor
later 1.3.0 2021-08-18 [2] CRAN (R 4.2.2)
lattice 0.20-45 2021-09-22 [4] CRAN (R 4.2.0)
lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.2)
logistf 1.24.1 2022-01-18 [2] CRAN (R 4.2.2)
magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.2.2)
MASS 7.3-58.1 2022-08-03 [1] CRAN (R 4.2.2)
Matrix 1.5-1 2022-09-13 [4] CRAN (R 4.2.1)
memoise 2.0.1 2021-11-26 [2] CRAN (R 4.2.2)
mgcv 1.8-41 2022-10-21 [4] CRAN (R 4.2.1)
mice 3.14.0 2021-11-24 [2] CRAN (R 4.2.2)
mime 0.12 2021-09-28 [2] CRAN (R 4.2.2)
miniUI 0.1.1.1 2018-05-18 [2] CRAN (R 4.2.2)
munsell 0.5.0 2018-06-12 [2] CRAN (R 4.2.2)
nlme 3.1-160 2022-10-10 [4] CRAN (R 4.2.1)
operator.tools 1.6.3 2017-02-28 [1] CRAN (R 4.2.2)
permute 0.9-7 2022-01-27 [1] CRAN (R 4.2.2)
pillar 1.8.1 2022-08-19 [2] CRAN (R 4.2.2)
pkgbuild 1.3.1 2021-12-20 [2] CRAN (R 4.2.2)
pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.2.2)
pkgload 1.3.1 2022-10-28 [1] CRAN (R 4.2.2)
plyr 1.8.7 2022-03-24 [2] CRAN (R 4.2.2)
prettyunits 1.1.1 2020-01-24 [2] CRAN (R 4.2.2)
processx 3.8.0 2022-10-26 [2] CRAN (R 4.2.2)
profvis 0.3.7 2020-11-02 [2] CRAN (R 4.2.2)
promises 1.2.0.1 2021-02-11 [2] CRAN (R 4.2.2)
ps 1.7.2 2022-10-26 [1] CRAN (R 4.2.2)
purrr 0.3.5 2022-10-06 [2] CRAN (R 4.2.2)
R6 2.5.1 2021-08-19 [2] CRAN (R 4.2.2)
radiator 1.2.3 2022-11-07 [2] Github (thierrygosselin/radiator@7f33802)
Rcpp 1.0.9 2022-07-08 [1] CRAN (R 4.2.2)
RCurl 1.98-1.9 2022-10-03 [1] CRAN (R 4.2.2)
readr 2.1.3 2022-10-01 [1] CRAN (R 4.2.2)
remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.2)
reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.2.2)
rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.2)
S4Vectors 0.36.0 2022-11-01 [1] Bioconductor
scales 1.2.1 2022-08-20 [2] CRAN (R 4.2.2)
SeqArray 1.38.0 2022-11-01 [2] Bioconductor
seqinr 4.2-16 2022-05-19 [1] CRAN (R 4.2.2)
SeqVarTools 1.36.0 2022-11-01 [2] Bioconductor
sessioninfo 1.2.2 2021-12-06 [2] CRAN (R 4.2.2)
shiny 1.7.3 2022-10-25 [2] CRAN (R 4.2.2)
stringi 1.7.8 2022-07-11 [1] CRAN (R 4.2.2)
stringr 1.4.1 2022-08-20 [2] CRAN (R 4.2.2)
tibble 3.1.8 2022-07-22 [2] CRAN (R 4.2.2)
tidyr 1.2.1 2022-09-08 [2] CRAN (R 4.2.2)
tidyselect 1.2.0 2022-10-10 [2] CRAN (R 4.2.2)
tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.2)
urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.2.2)
usethis 2.1.6 2022-05-25 [2] CRAN (R 4.2.2)
utf8 1.2.2 2021-07-24 [2] CRAN (R 4.2.2)
vctrs 0.5.0 2022-10-22 [1] CRAN (R 4.2.2)
vegan 2.6-4 2022-10-11 [2] CRAN (R 4.2.2)
vroom 1.6.0 2022-09-30 [1] CRAN (R 4.2.2)
withr 2.5.0 2022-03-03 [2] CRAN (R 4.2.2)
xtable 1.8-4 2019-04-21 [2] CRAN (R 4.2.2)
XVector 0.38.0 2022-11-01 [1] Bioconductor
zlibbioc 1.44.0 2022-11-01 [1] Bioconductor

thierrygosselin commented 1 year ago

Dear @ymatmt,

Would you mind sharing your PLINK file by email? The function works with my own PLINK file, so I need yours to reproduce the problem. Also, could you run this command and copy/paste the output:

test1 <- radiator::read_plink(data = "plink.bed")

Thanks for reporting the bug,
Thierry

thierrygosselin commented 1 year ago

Should work now with version 1.2.4.

stephaniehall123 commented 1 year ago

I'm having the exact same issue. I updated to version 1.2.4, but it is still not working for me. I tried both a PLINK .bed file and a VCF file and got the same error each time.

Code: genepop = genomic_converter("pruned_estuary_Miramichi_final.bed", strata = "strata.txt", output = "genepop", filename = "pruned_estuary_Miramichi_filtered")

Error:
Execution date@time: 20230104@1117
Folder created: 17_radiator_genomic_converter_20230104@1117
Function call and arguments stored in: radiator_genomic_converter_args_20230104@1117.tsv
Filters parameters file generated: filters_parameters_20230104@1117.tsv
Reading PLINK bed file...

Data summary:
number of samples: 120
number of markers: 17698
Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") :
  The GDS node "$ref" does not exist.

Computation time, overall: 1 sec

Computation time, overall: 1 sec

thierrygosselin commented 1 year ago

I cannot reproduce this error. I will have my computer, with more datasets and tests, on 2022/01/16.

In the meantime, try other conversion software.

Sorry for the inconvenience,
Thierry

jcaccavo commented 1 year ago

Hi Thierry and others with this issue,

I have the same issue when trying to run radiator::filter_rad. In my case, the input is a .vcf file produced by Stacks.

You can download the .vcf file from my dropbox, as well as the strata file.

I have radiator version 1.2.5.

I tried just reading the vcf file (radiator::read_vcf), and also got the same error.

Below are the commands I used and the radiator output (including this error).

Thanks in advance for your help! :)

data <- radiator::filter_rad(
  data = "3_subarea_p3_p1r0.6_populations.snps.vcf",
  strata = "strata_subarea.tsv",
  output = "tidy",
  interactive.filter = TRUE,
  verbose = TRUE,
  parallel.core = parallel::detectCores() - 1
)

################################################################################
############################# radiator::filter_rad #############################
################################################################################
# Execution date@time: 20230127@1156
# Folder created: filter_rad_20230127@1156
# Function call and arguments stored in: radiator_filter_rad_args_20230127@1156.tsv
# File written: random.seed (452882)                                  
# Filters parameters file generated: filters_parameters_20230127@1156.tsv
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :                  
#   There are too many lines in the header (>= 10000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
#   There are too many lines in the header (>= 20000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
#   There are too many lines in the header (>= 30000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
#   There are too many lines in the header (>= 40000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# ✔ Reading VCF [6m 32.2s]
# Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") : 
#   The GDS node "$ref" does not exist.
# 
# Computation time, overall: 392 sec
############################# completed filter_rad #############################

test1 <- radiator::read_vcf("3_subarea_p3_p1r0.6_populations.snps.vcf")

################################################################################
############################## radiator::read_vcf ##############################
################################################################################
# Execution date@time: 20230127@1902
# Folder created: read_vcf_20230127@1902
# Function call and arguments stored in: radiator_read_vcf_args_20230127@1902.tsv
# File written: random.seed (679284)                                  
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
#   There are too many lines in the header (>= 10000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
#   There are too many lines in the header (>= 20000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
#   There are too many lines in the header (>= 30000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# Warning in SeqArray::seqVCF_Header(vcf.fn = vcf) :
#   There are too many lines in the header (>= 40000). In order not to slow down the conversion, please consider deleting unnecessary annotations (like contig).
# ✔ Reading VCF [6m 50.1s]
# Analyzing VCF
# VCF source: Stacks v2.61
# Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") : 
#   The GDS node "$ref" does not exist.
# 
# Computation time, overall: 410 sec
# ############################## completed read_vcf ##############################
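
Since the error is raised by `SeqArray::seqGetData()` rather than by radiator code, one way to narrow it down might be to call SeqArray directly on a small file. The following is only a sketch under assumptions: it uses the example VCF shipped with SeqArray (swap in your own path), and it assumes that `"$ref"` is a virtual variable that SeqArray derives from the `allele` node, not a node stored in the GDS file. It is not how radiator reads files internally.

```r
# Minimal check, independent of radiator: can SeqArray itself resolve "$ref"?
library(SeqArray)

# Example VCF shipped with SeqArray; replace with the path to your own VCF.
vcf.file <- seqExampleFileName("vcf")
gds.file <- tempfile(fileext = ".gds")

# Convert the VCF to a SeqArray GDS file, then open it read-only.
seqVCF2GDS(vcf.fn = vcf.file, out.fn = gds.file, verbose = FALSE)
gds <- seqOpen(gds.file)

# Capture the error message instead of stopping, so the outcome can be reported.
ref <- tryCatch(
  seqGetData(gdsfile = gds, var.name = "$ref"),
  error = function(e) conditionMessage(e)
)
n.variants <- length(seqGetData(gds, "variant.id"))
seqClose(gds)

# Report the SeqArray version together with the result.
print(packageVersion("SeqArray"))
print(head(ref))
```

If this runs cleanly on the example file but the same call fails on your own data, the file (or the GDS built from it) is the more likely culprit. If it fails here too, the installed SeqArray build would be the first thing to look at.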