rwdavies / QUILT

GNU General Public License v3.0

preparing data for imputation with QUILT #19

Open genaev opened 1 year ago

genaev commented 1 year ago

I would like to impute data from another chip in order to obtain the missing SNPs of the BovineSNP50 chip. My source data are in ped format. I generated VCF files using plink1.9 and prepared a reference (~5000 genotypes) using the ./QUILT_prepare_reference.R script. Now, in order to perform the imputation, I have to convert the ped/VCF files from the other chips to SAM/BAM format. I tried using bedToBam from the bedtools package for this, but it doesn't seem to work correctly. Can you please recommend a way to convert ped/VCF files to BAM format for use in QUILT?

rwdavies commented 1 year ago

Hi,

Apologies, I missed this; I'm not sure how.

If this is still relevant, can you give a bit more context? QUILT is designed to impute from low coverage whole genome sequence data, for which you would have BAMs, which are the preferred input. Read-level data allows the method to distinguish between, say, one read intersecting two SNPs and two separate reads each intersecting one of those SNPs. These two cases can't be told apart from genotype-level data.
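To make the read-level point concrete, here is a toy Python sketch. The data are made up and the helper names (`per_site_counts`, `linked_observations`) are hypothetical; this is not QUILT code or its input format. It only shows that two scenarios can give identical per-site allele counts while differing in whether any single read links the two SNPs.

```python
from collections import Counter

# Toy data (an assumption for illustration): each read maps
# SNP position -> allele observed on that read.
one_read = [{100: "A", 150: "T"}]      # one read spanning both SNPs
two_reads = [{100: "A"}, {150: "T"}]   # two reads, one SNP each


def per_site_counts(reads):
    """Collapse reads to per-site allele counts, discarding read identity."""
    counts = {}
    for read in reads:
        for pos, allele in read.items():
            counts.setdefault(pos, Counter())[allele] += 1
    return counts


def linked_observations(reads):
    """Return allele combinations seen together on a single read (2+ SNPs)."""
    return [tuple(sorted(read.items())) for read in reads if len(read) >= 2]


# Per-site summaries are identical for the two scenarios...
same_counts = per_site_counts(one_read) == per_site_counts(two_reads)
print(same_counts)

# ...but only the spanning read says the two alleles sit on the same haplotype.
print(linked_observations(one_read))
print(linked_observations(two_reads))
```

Once the data are reduced to per-site calls, as in ped/VCF files from an array, the linkage carried by the spanning read is gone, and converting those files to BAM format does not bring it back.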

Otherwise, there are several other imputation tools designed for DNA genotyping microarray input, such as the various versions of Beagle or IMPUTE, and tools not specific to humans, such as findhap and the AlphaImpute suite of tools, which might be more appropriate for your needs.

Best, Robbie