KangchengHou / admix-kit

Toolkit for analyzing genetics data from admixed populations
https://kangchenghou.github.io/admix-kit

align af_chunk #5

Closed KangchengHou closed 2 years ago

KangchengHou commented 2 years ago

https://github.com/KangchengHou/admix-tools/blob/28b12364ade0cada77825b6b36bc4d9029ecfb11/admix/tools/__init__.py#L226-L230

This will be incorrect if af_chunk is chunked differently from the genotypes. Make sure the chunks align with each other.
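As a rough illustration of the concern, here is a minimal pure-Python sketch that checks whether two chunk layouts along the SNP axis line up. The helper names (`chunk_bounds`, `chunks_aligned`) and the chunk sizes are hypothetical and are not part of admix-kit; the tuples only mimic the per-axis chunk sizes that a dask array reports in `.chunks`.

```python
from itertools import accumulate


def chunk_bounds(chunk_sizes):
    """Turn chunk sizes, e.g. (3, 3, 2), into (start, stop) index pairs."""
    stops = list(accumulate(chunk_sizes))
    starts = [0] + stops[:-1]
    return list(zip(starts, stops))


def chunks_aligned(geno_chunks, af_chunks):
    """True when both arrays are split at identical positions along the SNP axis."""
    return tuple(geno_chunks) == tuple(af_chunks)


# Hypothetical chunk layouts for 8 SNPs (illustrative values only).
geno_chunks = (3, 3, 2)
af_chunks = (4, 4)

aligned = chunks_aligned(geno_chunks, af_chunks)
geno_bounds = chunk_bounds(geno_chunks)  # SNP ranges covered by each genotype chunk
af_bounds = chunk_bounds(af_chunks)      # SNP ranges covered by each af chunk
```

Here the first genotype chunk covers SNPs 0 to 2 while the first af chunk covers SNPs 0 to 3, so pairing the chunks by position would combine values from different SNP ranges. With dask arrays, one possible remedy is to `rechunk` one array to the other's chunk sizes before iterating over chunks.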

KangchengHou commented 2 years ago
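import numpy as np
import dask.array as da
# xrpgen, admix_genet_cor, admix and PGEN_PATH are assumed to be imported/defined elsewhere.

# Load phased genotypes and attach local ancestry, keeping the first 2000 SNPs.
dset = xrpgen.read_pfile(PGEN_PATH, phase=True)
dset["lanc"] = (dset.geno.dims, da.from_zarr(PGEN_PATH.replace(".pgen", ".lanc")))
dset = dset.isel(snp=np.arange(2000))
dset.attrs["n_anc"] = 2

admix_genet_cor.af_per_anc(dset)
admix_genet_cor.allele_per_anc(dset, center=True)

# Reference result from admix.tools on the same first 2000 SNPs.
dset2 = admix.data.load_lab_dataset("page_eur_afr_hm3", chrom=22)
dset2 = dset2.isel(snp=np.arange(2000))
admix.tools.allele_per_anc(dset2, center=True)

# Compare the two results: the first two axes of dset are swapped,
# and ancestry index 0 in dset is matched against index 1 in dset2 (and vice versa).
np.allclose(
    dset.allele_per_anc.values[:, :, 0].swapaxes(0, 1),
    dset2.allele_per_anc.values[:, :, 1],
    atol=1e-6,
) and np.allclose(
    dset.allele_per_anc.values[:, :, 1].swapaxes(0, 1),
    dset2.allele_per_anc.values[:, :, 0],
    atol=1e-6,
)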
KangchengHou commented 2 years ago

Closing because allele_per_anc no longer has a centering option.