lindington closed this issue 1 year ago
Hi,
I am also having this issue, is there a way to resolve it and continue the filtering?
Thanks, Jess
Hi,
I'm encountering the same problem. Have you been able to solve it?
Hello,
I am also having the same kind of issue. I am trying to use the function tidy_genomic_data in radiator to convert a VCF file to a tidy.vcf format. I had already successfully converted the same VCF file with this function in a previous version of radiator (v1.1.5).
Installation
Windows Server 2012 R2 Standard
R version 4.2.1
radiator_1.2.2
SeqArray_1.36.3
Code
library(radiator)
tidy.vcf <- tidy_genomic_data(data = "HapMap_MD_S80I25_MAF0.01_He0.6_hwe_SnpD_IndD_1SNP_LD_1156_Assigner_20200501.vcf", strata = "Pop_assigner_19g_updtae_20220915.txt")
################################################################################
######################### radiator::tidy_genomic_data ##########################
################################################################################
Execution date@time: 20220922@1858
Folder created: -8_radiator_tidy_genomic_20220922@1858
Function call and arguments stored in: radiator_tidy_genomic_data_args_20220922@1858.tsv
Analyzing strata file
Number of strata: 19
Number of individuals: 1156
Importing and tidying the VCF...
Execution date@time: 20220922@1858
Reading VCF...
Data summary:
number of samples: 1156
number of markers: 2340
Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") :
The GDS node "$ref" does not exist.
Computation time, overall: 18 sec
############################## completed tidy_vcf ##############################
Computation time, overall: 18 sec
######################### completed tidy_genomic_data ##########################
Interestingly, when I used a subset of my dataset (only 116 SNPs with 1156 individuals), I got a different error message:
tidy.vcf <- tidy_genomic_data(data = "/Users/Admin/Documents/HapMap_1156_subset_Assigner_20220922.vcf", strata = "/Users/Admin/Documents/Pop_assigner_19g_updtae_20220915.txt")
################################################################################
######################### radiator::tidy_genomic_data ##########################
################################################################################
Execution date@time: 20220922@2008
Folder created: 00_radiator_tidy_genomic_20220922@2008
Function call and arguments stored in: radiator_tidy_genomic_data_args_20220922@2008.tsv
Analyzing strata file
Number of strata: 19
Number of individuals: 1156
Importing and tidying the VCF...
Execution date@time: 20220922@2008
Reading VCF...
Data summary:
number of samples: 1156
number of markers: 116
Filter monomorphic markers
Number of individuals / strata / chrom / locus / SNP:
Blacklisted: 0 / 0 / 0 / 0 / 0
Error in SeqArray::seqMissing(gdsfile = x, per.variant = TRUE, .progress = FALSE, :
unused argument (.progress = FALSE)
In addition: Warning message:
In as.POSIXlt.POSIXct(x, tz) :
unable to identify current timezone 'C': please set environment variable 'TZ'
Computation time, overall: 2 sec
############################## completed tidy_vcf ##############################
Computation time, overall: 2 sec
######################### completed tidy_genomic_data ##########################
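The timezone warning in this output looks unrelated to the SeqArray error, and it says what it wants: a TZ environment variable. A minimal sketch of setting one for the current R session, assuming "UTC" is an acceptable value (it is only an example, not something taken from this thread):

```r
# Set the time zone for this R session only.
# "UTC" is an example value; pick the zone that matches your system or data.
Sys.setenv(TZ = "UTC")

# Confirm what R now sees in the environment variable.
Sys.getenv("TZ")
```

Setting it in a .Renviron file instead would make it persist across sessions.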
Files: file.zip
Thanks for the help!
Hi all,
I have solved the problem with the function doCall from the R.utils package. It does not return an error if unused arguments are passed to the functions of the SeqArray package. Indeed, several changes to function arguments in the new version of SeqArray are not yet taken into account in the current version of the radiator package.
R.utils::doCall(genomic_converter("file.vcf",output = "file.hzar", verbose = TRUE))
The bad news is that I still get the following error:
Error in SeqArray::seqGetData(gdsfile = data, var.name = "$ref") :
The GDS node "$ref" does not exist.
Hi Thierry,
I am seeing what appears to be the same error when running radiator::filter_rad()
data <- filter_rad(data = "populations.snps.vcf", interactive.filter = TRUE, output = c("vcf", "plink"), filename = NULL, verbose = TRUE, parallel.core = parallel::detectCores() - 1)
Reading VCF...
Data summary:
number of samples: 141
number of markers: 43678
Generating individual stats...
Generating markers stats...
Error in SeqArray::seqAlleleCount(gdsfile = gds, ref.allele = NULL, .progress = TRUE, : unused argument (.progress = TRUE)
I tried to find a workaround, downgrading SeqArray etc., but that did not work.
Ended up forking (zjons/radiator) and changing ".progress = TRUE" to "verbose = TRUE" in every call to SeqArray:: where it appeared that I could find (in gds.R, filter_mac.R and filter_common_markers.R). That seems to do the trick. At least I can read in the file now.
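For anyone who would rather not edit each call site by hand, here is a minimal sketch of the same rename done through a small wrapper that checks which argument name the target function actually accepts. Everything in it is hypothetical: call_with_progress, old_style_fun and new_style_fun are illustration names, the two stand-in functions only mimic a function before and after such a rename, and this is not how radiator itself handles it.

```r
# Stand-ins for a function before and after its progress argument was renamed.
# (Hypothetical: in real code these would be the SeqArray functions.)
old_style_fun <- function(x, .progress = FALSE) {
  if (.progress) message("working (old argument name)...")
  sum(x)
}
new_style_fun <- function(x, verbose = FALSE) {
  if (verbose) message("working (new argument name)...")
  sum(x)
}

# Pass the progress flag under whichever name `fun` declares.
# If the function declares neither name, the flag is simply not passed.
call_with_progress <- function(fun, ..., progress = FALSE) {
  arg_names <- names(formals(fun))
  args <- list(...)
  if (".progress" %in% arg_names) {
    args[[".progress"]] <- progress
  } else if ("verbose" %in% arg_names) {
    args[["verbose"]] <- progress
  }
  do.call(fun, args)
}

# Both calls run without an "unused argument" error.
call_with_progress(old_style_fun, x = 1:3, progress = TRUE)
call_with_progress(new_style_fun, x = 1:3, progress = TRUE)
```

Checking formals() at call time means the same code can run against either version of the dependency, at the cost of one extra layer around each call.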
Should work with v.1.2.3
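To check what is installed before retrying, something like the following may help. has_min_version is a hypothetical helper written for this example; the only version number used is the 1.2.3 mentioned above.

```r
# Return TRUE if `pkg` is installed and at least `min_version`, FALSE otherwise.
# (Hypothetical helper, not part of radiator or SeqArray.)
has_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  utils::packageVersion(pkg) >= package_version(min_version)
}

# Is radiator at the version mentioned in this thread?
has_min_version("radiator", "1.2.3")

# Print the installed SeqArray version, if there is one.
if (requireNamespace("SeqArray", quietly = TRUE)) {
  print(utils::packageVersion("SeqArray"))
}
```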
Hi Thierry,
I am trying to load a VCF file into radiator, ultimately to convert it into hzar input format. I think there is an issue with the argument .progress = TRUE in the SeqArray functions used in the radiator code. This argument seems to have been changed to verbose in the newest SeqArray version. From the SeqArray Vignette:
Arguments
gdsfile
a SeqVarGDSClass object
ref.allele
NULL, a single numeric value, a numeric vector or a character vector; see Value
minor
if TRUE, return minor allele frequency/count
parallel
FALSE (serial processing), TRUE (multicore processing), numeric value or other value; parallel is passed to the argument cl in seqParallel, see seqParallel for more details.
verbose
if TRUE, show progress information
My error: When using
tidy_vcf(data="file.vcf", parallel.core=1, output="file.hzar", verbose = TRUE)
or
genomic_converter("file.vcf",output = "file.hzar", verbose = TRUE)
I get the following error:
My session info: