kkdey / GSSG

Gene Set + S2G strategy annotations analyzed for disease architecture

Step 2 SNP annotations for sc-linker #22

Closed by happyhamm 1 year ago

happyhamm commented 1 year ago

Hi @kkdey - Thanks so much for creating this great toolkit! We're excited to implement this for our studies.

Following the instructions for "STEP 2: Gene set/program to SNP annotation" with sc-linker gives annotations in the expected directory layout, with outputs named "100kb" and "ABC_Road_{enhancer_tissue}" (e.g. an output named "ABC_Road_GI"). How do I recreate the results posted in the publication?

  1. Is "ABC_Road_GI" the same as the published "ABC_Road_GI_GI"?
  2. How do I generate results for "ABC_Road_GI_GI" and "ABC_Road_I_GI"?
  3. How do I generate results for "ABC_Road_ALL"?

Thank you so much!

kkdey commented 1 year ago

Hi @happyhamm,

  1. Yes, it is the same.
  2. You can ignore ABC_Road_I_GI wherever ABC_Road_GI_GI is present; it corresponds to an older version.
  3. For ALL, use all rows of the ABC and Roadmap files, corresponding to all tissues. See the code here: https://github.com/kkdey/GSSG/blob/master/code/GeneSet_toS2G/all_bedgraph_methods.R
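To illustrate the difference between a tissue-specific annotation and the ALL variant, here is a minimal Python sketch of the row selection. It is not the repository's R code: the `LINKS` rows, the column names, the tissue labels, and the `select_links` helper are all hypothetical, and the real ABC and Roadmap files use their own formats.

```python
# Hypothetical enhancer-gene link rows. The real ABC and Roadmap files have
# their own column names and tissue labels, which are not reproduced here.
LINKS = [
    {"chr": "chr1", "start": 1000, "end": 1500, "gene": "GENE_A", "tissue": "stomach"},
    {"chr": "chr1", "start": 4000, "end": 4600, "gene": "GENE_B", "tissue": "liver"},
    {"chr": "chr2", "start": 200, "end": 900, "gene": "GENE_A", "tissue": "colon"},
    {"chr": "chr3", "start": 50, "end": 300, "gene": "GENE_C", "tissue": "brain"},
]


def select_links(links, gene_set, tissue_keywords=None):
    """Keep links whose target gene is in gene_set.

    If tissue_keywords is None, rows from every tissue are kept (the "ALL"
    case). Otherwise only rows whose tissue label contains one of the
    keywords are kept (the tissue-specific case).
    """
    selected = []
    for row in links:
        if row["gene"] not in gene_set:
            continue
        if tissue_keywords is not None and not any(
            keyword in row["tissue"] for keyword in tissue_keywords
        ):
            continue
        selected.append(row)
    return selected


gene_set = {"GENE_A", "GENE_C"}

# Tissue-specific selection (hypothetical keywords standing in for "GI").
gi_links = select_links(LINKS, gene_set, tissue_keywords=["stomach", "colon"])

# "ALL" selection: no tissue filter, every row for the gene set is used.
all_links = select_links(LINKS, gene_set)

print(len(gi_links), len(all_links))
```

The only difference between the two calls is whether a tissue filter is applied; the selected intervals would then be written out and turned into SNP annotations by the pipeline's own scripts.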

happyhamm commented 1 year ago

Very helpful - Thanks so much, @kkdey!

happyhamm commented 1 year ago

Hi @kkdey, I tried running Step 2 and Step 3 starting from the posted gene scores linked below, and got slightly different enrichment and tau-star values from those reported in the publication. The results were well correlated, but the final values were slightly off.

gene score link we used: https://alkesgroup.broadinstitute.org/LDSCORE/Jagadeesh_Dey_sclinker/gene_scores/

sclinker results we compared against: https://alkesgroup.broadinstitute.org/LDSCORE/Jagadeesh_Dey_sclinker/sclinker_results/cell_type_programs/

Any suggestions on why this might be the case? Thanks!

happyhamm commented 1 year ago

Tracked this issue to differences in LDSC versions.
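For anyone checking a rerun against the posted results, here is a minimal Python sketch of one way to quantify "well correlated but slightly off". The numbers are made-up placeholders, not values from the publication or from any LDSC run, and the `compare_results` helper and annotation names are hypothetical.

```python
import math


def compare_results(published, rerun):
    """Compare two {annotation: value} mappings on their shared annotations.

    Returns the Pearson correlation and the largest absolute difference.
    """
    shared = sorted(set(published) & set(rerun))
    if len(shared) < 2:
        raise ValueError("need at least two shared annotations")
    x = [published[name] for name in shared]
    y = [rerun[name] for name in shared]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    pearson = cov / math.sqrt(var_x * var_y)
    max_abs_diff = max(abs(a - b) for a, b in zip(x, y))
    return pearson, max_abs_diff


# Placeholder tau-star values for illustration only.
published = {"program_1": 1.0, "program_2": 2.0, "program_3": 3.0}
rerun = {"program_1": 1.1, "program_2": 2.1, "program_3": 3.1}

pearson, max_abs_diff = compare_results(published, rerun)
print(pearson, max_abs_diff)
```

A correlation near 1 together with a small, consistent offset is the pattern described above. Recording the LDSC version used for each run alongside these numbers makes this kind of discrepancy easier to trace.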