KangchengHou / admix-kit

Toolkit for analyzing genetics data from admixed populations
https://kangchenghou.github.io/admix-kit

Numerical overflow #4

Closed by yidingdd 2 years ago

yidingdd commented 3 years ago
import xarray as xr
dset_region = xr.open_zarr("/u/scratch/y/yiding/admixture-finemapping/data/ukb_eur_afr_imputed/1/1:1000000-2000000")
X = dset_region.geno.values
X[:,3,:].T @ X[:,3,:]
array([[ -7,  21],
       [ 21, -12]], dtype=int8)

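The int8 data type causes numerical overflow when computing the X^T X matrix. The diagonal entries are sums of squares and so cannot be negative, yet the result above contains -7 and -12.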
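For readers without access to the dataset, the wraparound can probably be reproduced with plain NumPy. This is a minimal sketch on made-up toy data (an all-ones matrix with 200 rows, which is an assumption and not the UK Biobank genotypes), and the cast to float64 is just one possible workaround, not necessarily how admix-kit handles it:

```python
import numpy as np

# Toy stand-in for one genotype slice (hypothetical data, not the real dataset):
# 200 rows, 2 columns, stored as int8 like the array in the report.
X = np.ones((200, 2), dtype=np.int8)

# The product keeps the int8 dtype, so each entry is stored modulo 256.
# The true value 200 does not fit in the int8 range [-128, 127].
wrapped = X.T @ X

# Casting to a wider dtype before multiplying avoids the wraparound.
exact = X.astype(np.float64).T @ X.astype(np.float64)

print(wrapped.dtype, wrapped)
print(exact.dtype, exact)
```

Casting before the product matters here: converting the int8 result afterwards would only widen values that have already wrapped.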

KangchengHou commented 2 years ago

This is fixed now: genotypes are read as float.