KangchengHou / admix-kit

Toolkit for analyzing genetics data from admixed populations
https://kangchenghou.github.io/admix-kit
22 stars 5 forks source link

beagle vs. eagle2 genetic map #17

Closed KangchengHou closed 1 year ago

KangchengHou commented 1 year ago

currently we use beagle genetic map for rfmix, eagle2 genetic map for hapgen. Ideal to leave only 1. TODO: check consistency between these 2 types of genetic map. Have a x-y plot where we put genetic position vs. physical position and compare across 2 data source.

KangchengHou commented 1 year ago

From these results, beagle and eagle2 genetic map are quite similar. We can just leave eagle2 genetic map (as eagle2 genetic map also provides local COMBINED_rate.

image

Use following code to reproduce the figure.

from scipy import stats
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# download beagle
!wget https://bochet.gcc.biostat.washington.edu/beagle/genetic_maps/plink.GRCh38.map.zip
!unzip plink.GRCh38.map.zip

# download eagle map
!wget https://storage.googleapis.com/broad-alkesgroup-public/Eagle/downloads/tables/genetic_map_hg38_withX.txt.gz

eagle_map = pd.read_csv("genetic_map_hg38_withX.txt.gz", delim_whitespace=True)

nrows, ncols = 4, 6
fig, axes = plt.subplots(
    figsize=(3 * ncols, 3 * nrows), dpi=150, nrows=nrows, ncols=ncols
)

for chrom in range(1, 23):
    ax = axes.flatten()[chrom - 1]
    chrom_beagle_map = pd.read_csv(
        f"plink.chr{chrom}.GRCh38.map",
        delim_whitespace=True,
        header=None,
    )
    chrom_eagle_map = eagle_map[eagle_map["chr"] == chrom]

    ax.scatter(
        chrom_beagle_map[3], chrom_beagle_map[2], s=0.5, alpha=0.01, label="Beagle"
    )
    ax.scatter(
        chrom_eagle_map["position"],
        chrom_eagle_map["Genetic_Map(cM)"],
        s=0.5,
        alpha=0.01,
        label="Eagle",
    )
    ax.set_xlabel("Physical position")
    ax.set_ylabel("Genetic position")
    ax.set_title(f"Chromosome {chrom}")
    legend = ax.legend()
    for handle in legend.legendHandles:
        handle.set_sizes([6.0])
        handle.set_alpha(1.0)
fig.tight_layout()
plt.show()