zhengxwen / SeqArray

Data management of large-scale whole-genome sequence variant calls (Development version only)
http://www.bioconductor.org/packages/SeqArray

Estimating IBD using MLE #73

Open BELKHIR opened 2 years ago

BELKHIR commented 2 years ago

Hi,

I'm trying to run the snpgdsIBDMLE function from the SNPRelate package on a SeqArray file opened from your example:

    file <- seqOpen(seqExampleFileName("KG_Phase1"))
    mlibd = snpgdsIBDMLE(file,num.thread=8)

This returns:

    Identity-By-Descent analysis (MLE) on genotypes:
    Calculating allele counts/frequencies ...
    [==================================================] 100%, completed, 0s (process 1)
    Excluding 122 SNVs (monomorphic: TRUE, MAF: NaN, missing rate: NaN)
    Working space: 1,092 samples, 19,651 SNVs
    using 8 (CPU) cores
    Error in snpgdsIBDMLE(file, num.thread = 8) : Invalid position in CIndex.

However, if I convert the SeqArray file to a SNP GDS file, the function works fine.
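For reference, the conversion route described above might look like the following sketch. It assumes SeqArray and SNPRelate are installed; the temporary output path, the 10-sample subset, and `num.thread = 2` are illustrative choices that are not part of the original report, made only so the example finishes quickly.

```r
# Sketch only: assumes SeqArray and SNPRelate are installed.
library(SeqArray)
library(SNPRelate)

seq.fn <- seqExampleFileName("KG_Phase1")
snp.fn <- tempfile(fileext = ".gds")  # hypothetical output path

# Convert the SeqArray GDS file to the SNP GDS format.
seqGDS2SNP(seq.fn, snp.fn, verbose = FALSE)

# Open the converted file with SNPRelate.
genofile <- snpgdsOpen(snp.fn)

# Illustrative subset: the first 10 samples, to keep the MLE run short.
samp <- read.gdsn(index.gdsn(genofile, "sample.id"))[1:10]

# Run the MLE-based IBD estimation on the SNP GDS file.
ibd <- snpgdsIBDMLE(genofile, sample.id = samp, num.thread = 2)

snpgdsClose(genofile)
```

Dropping the `sample.id` argument would run the analysis on all samples, which takes considerably longer.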

Am I missing something?

Regards,

zhengxwen commented 2 years ago

If you specify a list of SNP IDs in snpgdsIBDMLE(), it works.

BELKHIR commented 2 years ago

Thanks! That can serve as a temporary solution.

Best regards

BELKHIR commented 2 years ago

Hi,

There is still a problem if the function is called on a subset of SNPs:

    file <- seqOpen(seqExampleFileName("KG_Phase1"))
    variant.id <- seqGetData(file, "variant.id")
    mlibd = snpgdsIBDMLE(file, snp.id = sample(variant.id, 100), num.thread=8)

    Identity-By-Descent analysis (MLE) on genotypes:
    Calculating allele counts/frequencies ...
    [==================================================] 100%, completed, 0s (process 1)
    Working space: 1,092 samples, 100 SNVs
    using 8 (CPU) cores
    Error in snpgdsIBDMLE(file, snp.id = sample(variant.id, 100), num.thread = 8) : Invalid position in CIndex.

Best regards,