zhengxwen / SeqArray

Data management of large-scale whole-genome sequence variant calls (Development version only)
http://www.bioconductor.org/packages/SeqArray

Error in haploid genotypes (Y chromosome) #1

Closed zhengxwen closed 10 years ago

zhengxwen commented 10 years ago

I tried to parse a VCF file with haploid genotypes and got a segfault. Does SeqArray currently support haploid genotypes?

thanks, Stephanie

> library(SeqArray)
Loading required package: gdsfmt
>
> dir <- "/projects/geneva/gcc-fs2/1000Genomes/20130723_phase3_wg/stn"
> vcffile <- file.path(dir, "ALL.chrY.stanford_v1.20130502.snps.low_coverage.genotypes.vcf.gz")
> gdsfile <- file.path(dir, "ALL.chrY.stanford_v1.20130502.snps.low_coverage.genotypes.gds")
> seqVCF2GDS(vcffile, gdsfile)
The Variant Call Format (VCF) header:
       file format: VCFv4.1
       the number of sets of chromosomes (ploidy): 1
Parsing "/projects/geneva/gcc-fs2/1000Genomes/20130723_phase3_wg/stn/ALL.chrY.stanford_v1.20130502.snps.low_coverage.genotypes.vcf.gz" ...

*** caught segfault ***
address (nil), cause 'memory not mapped'

Traceback:
1: .Call("seq_Parse_VCF4", vcf.fn[i], header, gfile$root, list(sample.num = as.integer(length(samp.id)),     genotype.var.name = genotype.var.name, raise.error = raise.error,     verbose = verbose), readline, opfile, new.env())
2: seqVCF2GDS(vcffile, gdsfile)
aborting ...
/opt/gridengine/default/spool/p06/job_scripts/452947: line 16: 11881 Segmentation fault      (core dumped) R -q --vanilla < $1

zhengxwen commented 10 years ago

Fixed in SeqArray_1.5.2!
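
For anyone retrying the conversion, a minimal sketch of one way to confirm that the installed SeqArray is at least 1.5.2 before calling seqVCF2GDS again. The helper has_haploid_fix is hypothetical (it is not part of SeqArray), and vcffile and gdsfile are assumed to be the paths from the report above; substitute your own haploid VCF. The comment about the first dimension of the genotype array being the ploidy reflects how SeqArray is understood to lay out genotypes and should be checked against the package documentation.

```r
# Hypothetical helper (not part of SeqArray): TRUE if `version` is at least
# 1.5.2, the release named in this thread as containing the fix.
has_haploid_fix <- function(version) {
    package_version(version) >= package_version("1.5.2")
}

# Only touch SeqArray if it is installed.
if (requireNamespace("SeqArray", quietly = TRUE)) {
    installed <- as.character(utils::packageVersion("SeqArray"))
    if (has_haploid_fix(installed)) {
        # Assumption: vcffile and gdsfile are defined as in the report above.
        if (exists("vcffile") && exists("gdsfile") && file.exists(vcffile)) {
            SeqArray::seqVCF2GDS(vcffile, gdsfile)
            f <- SeqArray::seqOpen(gdsfile)
            geno <- SeqArray::seqGetData(f, "genotype")
            # The first dimension is expected to be the ploidy,
            # so a haploid file should show 1 here.
            print(dim(geno))
            SeqArray::seqClose(f)
        }
    } else {
        message("SeqArray ", installed, " predates 1.5.2; update before retrying.")
    }
}
```

The version check is kept separate from the conversion so that an outdated installation produces a message rather than another crash; a segfault in compiled code cannot be caught from R with tryCatch.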