weizhouUMICH / SAIGE

GNU Lesser General Public License v3.0

numRandomMarkerforSparseKin and relatednessCutoff in GRM generation #398

Closed Shicheng-Guo closed 2 years ago

Shicheng-Guo commented 2 years ago

Dear Wei,

Happy New Year! Is there a best-practice recommendation for the GRM generation parameters, for example minMAFforGRM, numRandomMarkerforSparseKin, and relatednessCutoff?

Thanks.

Shicheng

createSparseGRM(
  plinkFile = "",
  outputPrefix = "",
  numRandomMarkerforSparseKin = 1000,
  relatednessCutoff = 0.125,
  memoryChunk = 2,
  isDiagofKinSetAsOne = FALSE,
  nThreads = 1,
  minMAFforGRM = 0.01,
  isSetGeno = TRUE,
  isWritetoFiles = TRUE
)

weizhouUMICH commented 2 years ago

Hi Shicheng,

We recommend using markers with MAF >= 1% for the GRM. We've tried different sample relatedness cutoffs on the UKBB data for association tests, and the results are quite similar. A larger value for numRandomMarkerforSparseKin is better for deciding which samples are related, at the cost of more computation time. Generally we see that 1000 or 2000 works, but I think it depends on the data. There are different ways to create a sparse GRM: https://saigegit.github.io//SAIGE-doc/docs/createSparseGRM.html

Thanks, Wei
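
For readers landing on this thread, the settings discussed above could be put together roughly as follows. This is only a sketch, not an official recipe: the argument names come from the createSparseGRM signature quoted in the question, the values follow the reply (MAF >= 1%, 2000 random markers, the default relatedness cutoff of 0.125), and the file prefixes and thread count are hypothetical placeholders to replace with your own. The call is guarded so that nothing runs unless SAIGE is installed and the PLINK file exists.

```r
# Hypothetical PLINK prefix (expects .bed/.bim/.fam); not from the thread.
plink_prefix <- "my_genotypes_for_grm"

# Arguments taken from the signature quoted above; values follow the reply.
grm_args <- list(
  plinkFile = plink_prefix,
  outputPrefix = "my_sparse_grm",       # hypothetical output prefix
  numRandomMarkerforSparseKin = 2000,   # "1000 or 2000 works", data dependent
  relatednessCutoff = 0.125,            # default from the quoted signature
  minMAFforGRM = 0.01,                  # markers with MAF >= 1%
  nThreads = 4                          # assumption: adjust to your machine
)

# Only run when SAIGE and the input file are actually available.
if (requireNamespace("SAIGE", quietly = TRUE) &&
    file.exists(paste0(plink_prefix, ".bed"))) {
  library(SAIGE)
  do.call(createSparseGRM, grm_args)
} else {
  message("SAIGE or the PLINK files were not found; nothing was run.")
}
```

Since the best value depends on the data, one option is to rerun with numRandomMarkerforSparseKin set to 1000 and to 2000 and check whether the set of sample pairs above the cutoff changes.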