KangchengHou / admix-kit

Toolkit for analyzing genetics data from admixed populations
https://kangchenghou.github.io/admix-kit
use plink2 as the storage engine #3

Closed KangchengHou closed 2 years ago

KangchengHou commented 2 years ago

Instead of using zarr as the storage engine, consider using plink2.

It should support random access; see https://github.com/chrchang/plink-ng/tree/master/2.0/Python
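A rough sketch of what random access through the Python wrapper linked above could look like. The helper name `read_variant_hardcalls` is made up for illustration, and the snippet assumes the `pgenlib` module from that directory is installed and that `PgenReader`, `get_raw_sample_ct`, `read` and `close` behave as described in its README; treat it as a starting point rather than a tested implementation.

```python
import numpy as np


def read_variant_hardcalls(pgen_path, variant_idx):
    """Read the hardcalls of a single variant without loading the whole file.

    Hypothetical helper: `pgen_path` is the path to a .pgen file and
    `variant_idx` is a 0-based variant index.
    """
    # Imported lazily so that this module can be imported without pgenlib.
    import pgenlib

    # PgenReader expects the file name as bytes.
    reader = pgenlib.PgenReader(pgen_path.encode())
    try:
        # One int8 entry per sample; read() fills the buffer in place.
        buf = np.empty(reader.get_raw_sample_ct(), dtype=np.int8)
        reader.read(variant_idx, buf)
        return buf
    finally:
        reader.close()
```

Each call only touches the requested variant, which is the access pattern needed to replace zarr as the backend.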

The big advantage is that the data could then be used directly with the many features plink2 already provides.

To check: whether plink2 supports storing phasing information.

Apart from this, we can have a separate file to store local ancestries.
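One possible shape for such a separate local ancestry file, sketched with plain NumPy. Everything here is an assumption for illustration: the `.lanc.npy` suffix, the `(n_variants, n_samples, 2)` int8 layout (one ancestry label per haplotype) and the helper names are not part of admix-kit or plink2.

```python
import os
import tempfile

import numpy as np


def write_lanc(path, lanc):
    """Save per-haplotype ancestry labels, shape (n_variants, n_samples, 2)."""
    lanc = np.asarray(lanc, dtype=np.int8)
    if lanc.ndim != 3 or lanc.shape[2] != 2:
        raise ValueError("expected shape (n_variants, n_samples, 2)")
    np.save(path, lanc)


def read_lanc_variant(path, variant_idx):
    """Read the labels of one variant via a memory map (random access)."""
    lanc = np.load(path, mmap_mode="r")
    # Copy the slice so the result does not depend on the open memory map.
    return np.array(lanc[variant_idx])


# Toy data: 4 variants, 3 samples, 2 haplotypes, ancestry labels 0 or 1.
toy = (np.arange(4 * 3 * 2).reshape(4, 3, 2) % 2).astype(np.int8)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.lanc.npy")
    write_lanc(path, toy)
    row = read_lanc_variant(path, 1)
```

Keeping the first axis in the same variant order as the plink2 file means a single variant index can address both the genotype and the local ancestry.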

The first step may be to implement functions similar to those in pandas-plink.