knausb / vcfR

Tools to work with variant call format files

Can't read GQ values from a freebayes VCF #113

Closed mthon closed 6 years ago

mthon commented 6 years ago

I'm going through some of the exercises from the workshop using my own VCF files generated with freebayes. When I try to make a matrix of GQ values, the values are all NA. I noticed that GQ values from freebayes are Floats, while those in the example VCF files are Integers. Could this be the problem?

Here's a link to download an RStudio project to reproduce it: gq_problem

mthon commented 6 years ago

Actually, I think the problem is with the freebayes VCF file. It defines GQ in the meta information, but no GQ values are reported in the genotype (GT) section of the file. Sorry for the trouble...

knausb commented 6 years ago

Hi mthon,

If you specify as.numeric=TRUE in your call to extract.gt(), you should get a numeric matrix. In R, numerics include floats, so this should not be the problem.
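As a minimal sketch of that call, the following uses the small vcfR_test dataset that ships with vcfR in place of the freebayes file (an assumption made here so that the example is self-contained; with real data the object would come from read.vcfR()).

```r
library(vcfR)

# Assumption: the bundled vcfR_test object stands in for the user's own data.
data(vcfR_test)
vcf <- vcfR_test

# Extract the GQ element for every sample and convert it to numeric.
# Without as.numeric = TRUE the result is a character matrix.
gq <- extract.gt(vcf, element = "GQ", as.numeric = TRUE)

# One row per variant, one column per sample.
dim(gq)
is.numeric(gq)
```

If every cell of the resulting matrix is NA, the GQ key is probably absent from the FORMAT column rather than being stored in an unsupported type.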

I placed my recommendations for posting an informative issue here. What's typically critical is a minimal reproducible example.

If your vcfR object is named vcf you could look at the following.

table(vcf@gt[,1])
vcf@gt[1:6,1:4]

The table() command will count how many records you have for each distinct FORMAT string. The second command will give you a peek at the top of your gt region, including the FORMAT column. If you do not see GQ in here, then your interpretation is correct.
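The same check can be done programmatically. The sketch below uses a made-up character matrix laid out like vcf@gt (FORMAT in the first column, one column per sample), so the values and sample names are purely illustrative and not taken from the freebayes file.

```r
# Hypothetical stand-in for vcf@gt: a character matrix with FORMAT first.
gt <- matrix(
  c("GT:DP:AD", "0/1:12:7,5",  "0/0:9:9,0",
    "GT:DP:AD", "1/1:15:0,15", "0/1:11:6,5",
    "GT:GQ:DP", "0/1:48.5:8",  "0/0:30.2:10"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("FORMAT", "sample1", "sample2"))
)

# Count the distinct FORMAT strings, as table(vcf@gt[,1]) does.
format_counts <- table(gt[, "FORMAT"])

# Split each FORMAT string on ":" and test whether GQ is one of its keys.
has_gq <- vapply(
  strsplit(gt[, "FORMAT"], ":", fixed = TRUE),
  function(keys) "GQ" %in% keys,
  logical(1)
)

# Number of records that actually carry a GQ field.
sum(has_gq)
```

With a real vcfR object, replacing gt with vcf@gt should give the same kind of summary; a count of zero would mean no record reports GQ, which matches the all-NA matrix.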

Thanks!