Using GSEA Gene Set Annotations

saxovocal commented 3 years ago

Hi, I am able to run the programme without problems, but I have a question about its usage.

When creating an iDEA object, one needs to supply the gene set annotations. Say I'd like to run three gene set collections from msigdbr (hallmark, immune, and all GO terms): should I run them separately or all together? If I run them together, one dataset takes 12 hours on 20 cores, and the results seem to differ from those I get when I run the hallmark set on its own.

Thanks,

W

YingMa0107 commented 3 years ago

Hi @saxovocal,

Thanks for your interest in applying iDEA. Could you please tell me at which step the results differ when you run the gene sets separately? I tested on the example dataset with a random 50 gene sets and with a subset of 10 gene sets, and the estimated annotation coefficients were quite similar. Because the package uses an MCMC sampling procedure to estimate the parameters, the numbers will not be exactly the same.

If you ran the FDR estimation step and observed a difference, it might be due to the permutation step. I recommend running the gene sets together to construct a reliable null distribution for FDR estimation.

Best, Ying

denvercal1234GitHub commented 4 months ago

Hi @YingMa0107 - Thanks for the tool! Could you give us example code for running multiple gene set annotations at once in CreateiDEAObject()?
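
For reference, a minimal sketch of one way to combine several collections into a single annotation table. It assumes that CreateiDEAObject() takes the annotation as a genes x gene-sets 0/1 data frame whose row names match the genes in the summary statistics, and that msigdbr returns a long table with `gs_name` and `gene_symbol` columns. The helper `build_annotation()` and the toy gene sets are hypothetical and only illustrate the reshaping step. The msigdbr and iDEA calls are left as comments because their argument names may differ between package versions.

```r
# Toy stand-in for msigdbr output (long format: one row per gene set / gene pair).
# With msigdbr installed, the three collections could instead be pulled and
# stacked with rbind(), e.g. msigdbr(species = "Homo sapiens", category = "H")
# for hallmark (assumption: argument names vary across msigdbr versions).
gs_long <- data.frame(
  gs_name     = c("HALLMARK_A", "HALLMARK_A", "GOBP_B", "GOBP_B", "IMMUNE_C"),
  gene_symbol = c("TP53", "MYC", "MYC", "EGFR", "TP53"),
  stringsAsFactors = FALSE
)

# Genes present in the summary statistics (here a toy vector; in practice
# this would be rownames() of the summary data passed to iDEA).
genes <- c("TP53", "MYC", "EGFR", "GAPDH")

# Hypothetical helper: long table -> genes x gene-sets 0/1 data frame.
build_annotation <- function(gs_long, genes) {
  # Keep only genes that appear in the summary data, and drop duplicate pairs
  keep <- gs_long$gene_symbol %in% genes
  pairs <- unique(gs_long[keep, c("gene_symbol", "gs_name")])
  # Cross-tabulate; fixing the factor levels keeps genes with no annotation
  tab <- table(factor(pairs$gene_symbol, levels = genes), pairs$gs_name)
  as.data.frame.matrix(tab)
}

annotation <- build_annotation(gs_long, genes)
print(annotation)

# Assumed downstream call, following the form used in the iDEA tutorial
# (not run here; verify the arguments against your installed version):
# idea <- CreateiDEAObject(summary_data, annotation, max_var_beta = 100,
#                          min_precent_annot = 0.0025, num_core = 10)
```

Stacking all collections into one table before the call means every gene set goes through the same run, in line with the recommendation above to run them together for FDR estimation.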

xzhoulab / iDEA

Using GSEA Gene Set Annotations #12