zhengxwen / SNPRelate

R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development version only)
http://www.bioconductor.org/packages/SNPRelate

Error with snpgdsVCF2GDS #34

Open annat22 opened 6 years ago

annat22 commented 6 years ago

Hello,

I'm getting this error with SNPRelate 1.12.2 and gdsfmt 1.14.1 when running snpgdsVCF2GDS:

Error in scan.vcf.marker(fn, method) : The file (Joint_allSNPjointMAF05.vcf.gz) has different numbers of columns.

It happens consistently with two very different datasets. The only suggestion I could find online was to remove some of the INFO lines in the VCF header, which seems like a somewhat brute-force fix.

Any input would be greatly appreciated.

Thanks, Annat

zhengxwen commented 6 years ago

Could you please try the SeqArray package?

SeqArray::seqVCF2GDS() may give you more informative error messages. Also try SeqArray::seqVCF_Header(), which returns the VCF header as an R object. You could edit that object to remove some of the INFO annotations and pass it to SeqArray::seqVCF2GDS(, header=...).
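
A minimal sketch of that workflow, in case it helps. It assumes the object returned by SeqArray::seqVCF_Header() carries the INFO definitions in a data frame `$info` with an `ID` column. The helper `drop_info_fields()`, the output file name, and the field ID `EXAMPLE_FIELD` are hypothetical placeholders, not part of SeqArray, so adjust them to your data.

```r
# Hypothetical helper (not part of SeqArray): drop selected INFO definitions
# from a parsed VCF header. Assumes `header$info` is a data.frame with an
# `ID` column, as in the object returned by SeqArray::seqVCF_Header().
drop_info_fields <- function(header, ids) {
  keep <- !(header$info$ID %in% ids)
  header$info <- header$info[keep, , drop = FALSE]
  header
}

vcf.fn <- "Joint_allSNPjointMAF05.vcf.gz"  # file name from the error message
gds.fn <- "Joint_allSNPjointMAF05.gds"     # assumed output name

# Only run the conversion if SeqArray is installed and the file is present.
if (requireNamespace("SeqArray", quietly = TRUE) && file.exists(vcf.fn)) {
  hdr <- SeqArray::seqVCF_Header(vcf.fn)
  print(hdr$info$ID)  # inspect which INFO fields the header declares

  # "EXAMPLE_FIELD" is a placeholder; replace it with the IDs to remove.
  hdr <- drop_info_fields(hdr, c("EXAMPLE_FIELD"))

  # Pass the edited header to the converter instead of re-reading it.
  SeqArray::seqVCF2GDS(vcf.fn, gds.fn, header = hdr)
}
```

Printing `hdr$info$ID` first makes it easier to remove only the annotations that appear to cause trouble, rather than stripping header lines from the VCF file by hand.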