marbl / CHM13

The complete sequence of a human genome

Imputation panels #41

Open gevro opened 2 years ago

gevro commented 2 years ago

Hi, are you aware of any CHM13-based imputation panels?

rmccoy7541 commented 2 years ago

Thanks for your interest, and sorry for the slow reply. The T2T Variants team recently produced genotypes for the 1000 Genomes Project samples, which are described in this paper: https://www.biorxiv.org/content/10.1101/2021.07.12.452063v1.full

The "Data and materials availability" section provides links to download these data. Note that the genotypes are currently unphased, but we are working on phasing them.

JosephLalli commented 1 year ago

@rmccoy7541,

With the release of the 1KGP genotypes, do you know whether someone in the consortium is working on phasing the calls? Do you all anticipate the phasing being performed by, say, April?

(If not, I might be interested in helping out on that project.)

rmccoy7541 commented 1 year ago

Hi @JosephLalli,

I can confirm that we are working on this, and I apologize that it has taken so long. While we think we have worked out how to replicate the phasing pipeline used previously by the NYGC, we are now evaluating the accuracy of the phasing results and their consistency with earlier phased genotype data generated using other reference genomes. We welcome your help in this effort if you're interested. If you email me (rajiv.mccoy jhu.edu) and my student, Andrew Bortvin (abortvi2 jhu.edu), we can share more details about the current status.

Best,

Rajiv

JosephLalli commented 1 year ago

I've reached out to you via email @rmccoy7541.

For those who may be interested, I've taken the liberty of phasing the T2T dataset myself. The repository, describing the methods used and performance metrics, can be found here.

The panel itself can be downloaded as a tarball from a Zenodo repository located here.