broadinstitute / gnomad_local_ancestry

Hail batch pipeline and scripts for local ancestry inference
MIT License

Estimate initial global ancestry proportion #34

Closed mike-w-wilson closed 3 years ago

mike-w-wilson commented 3 years ago

Use ADMIXTURE to sanity check admixture proportions

mike-w-wilson commented 3 years ago

use ADMIXTURE

mike-w-wilson commented 3 years ago

Ran Alicia's lai_global script after creating BED files with collapse_ancestry.py, as a sanity check on how well the current pipeline matches ADMIXTURE's k=3 output. Hexplots are attached for each ancestry.

Parsed lai_global's output into homogeneous, 2-way admixed, and 3-way admixed groups using a 5% ancestry inclusion threshold. 5% of gnomAD-labeled amr individuals are homogeneous, 60% are 2-way admixed, and 35% are 3-way admixed. Of the 5% that are homogeneous, ~110 are homogeneous for NatAm ancestry and ~250 for EUR ancestry.
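For anyone wanting to reproduce the bucketing, here is a minimal sketch of the 5% threshold rule. It is not the code used for the numbers above: the input layout (a dict of per-sample ancestry fractions), the names `admixture_class` and `samples`, the toy values, and the choice of `>=` at the threshold are all assumptions made for illustration.

```python
from collections import Counter

THRESHOLD = 0.05  # 5% ancestry inclusion threshold described above


def admixture_class(proportions, threshold=THRESHOLD):
    """Label one individual by how many ancestries reach the threshold.

    `proportions` maps an ancestry label to that individual's global
    ancestry fraction (hypothetical layout, not lai_global's real output).
    Using >= at the boundary is an assumption.
    """
    n_included = sum(1 for p in proportions.values() if p >= threshold)
    if n_included == 0:
        raise ValueError("no ancestry reaches the inclusion threshold")
    if n_included == 1:
        return "homogeneous"
    return f"{n_included}-way admixed"


# Made-up example individuals; these are not gnomAD values.
samples = {
    "s1": {"AFR": 0.01, "NAT": 0.97, "EUR": 0.02},
    "s2": {"AFR": 0.02, "NAT": 0.48, "EUR": 0.50},
    "s3": {"AFR": 0.20, "NAT": 0.30, "EUR": 0.50},
}

# Tally how many individuals fall into each class.
counts = Counter(admixture_class(p) for p in samples.values())
for label, n in sorted(counts.items()):
    print(f"{label}: {n} ({n / len(samples):.0%})")
```

Counting ancestries at or above the threshold, rather than checking whether the top ancestry exceeds 95%, keeps the rule the same for any number of ancestries.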

Attachments: pong_amr_k10.png, afr_prop.png, nat_prop.png, eur_prop.png, cv_k10.png