zhengxwen / SNPRelate

R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development version only)
http://www.bioconductor.org/packages/SNPRelate
98 stars 25 forks source link

offset for grm in snpgdsGRM(method="IndivBeta") #62

Closed jgx65 closed 5 years ago

jgx65 commented 5 years ago

Hi,

Is there a possibility to output (and store) the minimum used for the grm when method="IndivBeta"? Currently, avg_val is produced, but it would be useful to have also min_beta in order to be able to reconstruct the initial grm without the offset. As it stands, it is not possible.

An alternative would be to add an option to snpgdsIndivBeta allowing to output a gds file [I need this in order to use the GAC pipeline with this GRM]

Thanks for your great work!

zhengxwen commented 5 years ago

Please use snpgdsGRM(, method="IndivBeta"):

 "IndivBeta": `beta = snpgdsIndivBeta(, inbreeding=TRUE)`, and
 beta-based GRM is $grm_ij = 2 * (beta_ij - beta_min) / (1 -
 beta_min)$ for $i!=j$, $grm_ij = 1 + (beta_i - beta_min) / (1 -
 beta_min)$ for $i=j$. It is relative to the minimum value of beta
 estimates.
jgx65 commented 5 years ago

Thanks. But I want the original beta grm, not that with a minimum set at zero. I cannot recover this minimum if I use snpgdsGRM(,method="IndivBeta") . I can calculate it with snpgdsIndivBeta, but this function does not have an out.fnargument, which mean I can't use it simply with the topmed pipeline. I found a dirty fix, by first saving the object produced by snpgdsIndivBeta and then converting it to a gds file, but it is not very practical.

zhengxwen commented 5 years ago
library(SNPRelate)
genofile <- snpgdsOpen(snpgdsExampleFileName())
snpgdsGRM(genofile, method="IndivBeta", out.fn="grm.gds")

The grm.gds contains the average M_ij/n_loci (i != j) storing in avg_val. Using this avg_val could recover the minimum.

File: grm.gds (291.5K)
+    [  ] *
|--+ command   { Str8 2, 30B }
|--+ sample.id   { Str8 279 LZMA_ra(30.5%), 701B }
|--+ snp.id   { Int32 8722 LZMA_ra(10.2%), 3.5K }
|--+ grm   { Float64 279x279 LZMA_ra(47.1%), 286.2K }
\--+ avg_val   { Float64 1, 8B }
jgx65 commented 5 years ago

I found the (simple) solution, sorry for this