Closed thierrygosselin closed 6 years ago
Setting parallel = 1 reveals more info on the error:
Error in SeqArray::seqVCF2GDS(vcf.fn = "test.vcf", out.fn = "testing", :
FORMAT ID 'AD' (Number=R) should have 2 value(s), but receives 3.
FILE: test.vcf
LINE: 87, COLUMN: 17, ./.:1:1,0,0:1:22:0,0:0,0:0,-0.30103,-2.19784,-0.30103,-2.19784,-2.19784
The VCF is multi-allelic (haplotypes); however, the variant on line 87 should be bi-allelic (C/T), so I suppose it comes down to how loosely you want to interpret the VCF 4.2 spec when parsing. I totally understand if you don't want to work around those issues.
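For anyone reading along, here is a rough base-R sketch of what the parser is complaining about. It takes the sample field from the error message and counts the comma-separated values per FORMAT key. Two things are assumptions on my part: the FORMAT key order (the usual freebayes layout, since the FORMAT column of line 87 is not shown in this thread) and the declared Number codes for AO, QA and GL (only AD's Number=R is confirmed by the header output in this thread).

```r
# Sample field copied verbatim from the error message.
field <- "./.:1:1,0,0:1:22:0,0:0,0:0,-0.30103,-2.19784,-0.30103,-2.19784,-2.19784"

# ASSUMPTION: typical freebayes FORMAT order; the real FORMAT column is not shown.
keys <- strsplit("GT:DP:AD:RO:QR:AO:QA:GL", ":", fixed = TRUE)[[1]]
vals <- strsplit(field, ":", fixed = TRUE)[[1]]
stopifnot(length(keys) == length(vals))

# Number of comma-separated values observed for each key.
n_values <- lengths(strsplit(vals, ",", fixed = TRUE))
names(n_values) <- keys

# Counts a strict parser expects for a diploid, bi-allelic site (1 ALT allele).
# ASSUMPTION: AO and QA are Number=A, GL is Number=G; AD is Number=R (confirmed).
n_alt <- 1L
expected <- c(
  AD = n_alt + 1L,                              # R: one value per allele, REF included
  AO = n_alt,                                   # A: one value per ALT allele
  QA = n_alt,                                   # A: one value per ALT allele
  GL = as.integer((n_alt + 1L) * (n_alt + 2L) / 2)  # G: one value per diploid genotype
)

# Side-by-side comparison of observed and expected counts.
comparison <- data.frame(
  key = names(expected),
  observed = unname(n_values[names(expected)]),
  expected = unname(expected)
)
print(comparison)
```

Under these assumptions AD carries 3 values where 2 are expected, which matches the error message, and the other fields look as if the record were written with two ALT alleles.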
See:
library(SeqArray)
seqVCF2GDS("test.vcf", "test.gds") # fails
hd <- seqVCF_Header("test.vcf")
hd$format
hd$format[hd$format$ID=="AD", ]
# ID Number Type Description
# 5 AD R Integer Number of observation for each allele
# change it
hd$format[hd$format$ID=="AD", "Number"] <- "."
hd$format[hd$format$ID=="AO", "Number"] <- "."
hd$format[hd$format$ID=="QA", "Number"] <- "."
hd$format[hd$format$ID=="GL", "Number"] <- "."
seqVCF2GDS("test.vcf", "test.gds", header=hd) # works
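If several fields need the same treatment, the repeated assignments can be wrapped in a small helper. This is only a sketch: relax_number() is a hypothetical name, not a SeqArray function, and the data frame below is a made-up stand-in for hd$format so the snippet runs without the VCF file.

```r
# Hypothetical helper (not part of SeqArray): set Number to "." (variable length)
# for the given FORMAT IDs in a header table shaped like hd$format.
relax_number <- function(fmt, ids) {
  missing_ids <- setdiff(ids, fmt$ID)
  if (length(missing_ids) > 0) {
    warning("IDs not found in header: ", paste(missing_ids, collapse = ", "))
  }
  fmt$Number <- as.character(fmt$Number)  # guard against factor columns
  fmt$Number[fmt$ID %in% ids] <- "."
  fmt
}

# Made-up stand-in for hd$format; the Number codes for AO, QA and GL are assumptions.
fmt <- data.frame(
  ID = c("GT", "DP", "AD", "AO", "QA", "GL"),
  Number = c("1", "1", "R", "A", "A", "G"),
  stringsAsFactors = FALSE
)

relaxed <- relax_number(fmt, c("AD", "AO", "QA", "GL"))
print(relaxed)

# With the real header this would be used as:
# hd$format <- relax_number(hd$format, c("AD", "AO", "QA", "GL"))
```

The helper returns a modified copy, so the original header stays available if you want to compare the two or retry with a different set of IDs.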
Much appreciated! Is there a way to skip the import of some fields (e.g. AO or QA) so that they are not included in the GDS from the start?
See:
nm <- hd$format$ID
nm <- setdiff(nm, c("AO", "QA"))
seqVCF2GDS("test.vcf", "test.gds", header=hd, fmt.import=nm)
thanks!
Hi Xiuwen,
I have another VCF file generating an error with SeqArray. This time, the VCF file was produced with freeBayes v1.1.0-4-gb6041c6.
Here's the command:
The error:
The link to get the file test.vcf was sent by email.
Results of my devtools::session_info():
Cheers,
Thierry