Closed JamesFrifan closed 6 months ago
Thank you so much for catching this! Our team will get on fixing this immediately.
Get Outlook for iOShttps://aka.ms/o0ukef
From: James Fu @.> Sent: Wednesday, May 29, 2024 2:45:01 AM To: FertigLab/CoGAPS @.> Cc: Subscribed @.***> Subject: [FertigLab/CoGAPS] getPatternGeneSet gives wrong gene.set names (Issue #107)
Hi,
I am trying to run the function getPatternGeneSet on my CoGAPS results using the hallmark gene list from msigdbr, as demonstrated in the following code:
hallmark_df <- msigdbr(species = "Homo sapiens", category = "H") hallmark_list <- hallmark_df %>% split(x = .$gene_symbol, f = .$gs_name) hallmarks_ora <- getPatternGeneSet(cogaps_result, gene.sets = hallmark_list, method = "enrichment", threshold = "cut", lp = NA, axis = 1)
When looking at the gene set testing result for a given pattern (e.g. View(hallmarks_ora[[1]])), I found the pathway and gene.set columns are mismatched which I think is not an intended behavior. Please see the snapshot below: image.png (view on web)https://github.com/FertigLab/CoGAPS/assets/48193804/ec52b7b4-fe77-4fc3-9584-7f371e89c9b1
By looking at the code, I found the issue may be that fgsea functions return sorted results on the padj column rather than keeping the same order as the input gene.set. As a result, directly setting the gene set names from the input names will cause a mismatch. image.png (view on web)https://github.com/FertigLab/CoGAPS/assets/48193804/86198d5e-b2de-4ed5-8666-7ef27842bde8
I hope the description above is clear and should be helpful for you.
— Reply to this email directly, view it on GitHubhttps://github.com/FertigLab/CoGAPS/issues/107, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AATMMKZ6TRDXLC55QEH3MULZEV2O3AVCNFSM6AAAAABIOIXJ72VHI2DSMVQWIX3LMV43ASLTON2WKOZSGMZDENJQGA4TOMQ. You are receiving this because you are subscribed to this thread.Message ID: @.***>
Thanks @JamesFrifan, can you please share your sessionInfo()
?
Bug confirmed and did not happen on test data as padj
was tied in the test data. To reproduce:
gs.test <- list(
#add a hallmark with lower p-adj:
"low_p" = c("Hs.101174", "Hs.1012", "Hs.1019", "Hs.101937", "Hs.102484", "Hs.103110"),
"gs1" = c("Hs.2", "Hs.4", "Hs.36", "Hs.96", "Hs.202"),
"gs2" = c("Hs.699463", "Hs.699288", "Hs.699280", "Hs.699154", "Hs.697294")
)
hallmarks_ora <- getPatternGeneSet(object = GIST.result, gene.sets = gs.test, method = "enrichment")
> print(hallmarks_ora[[1]][,c(1,10)])
pathway gene.set
<char> <fctr>
1: gs1 low_p
2: gs2 gs1
3: low_p gs2
@jmitchell81
Hi,
I am trying to run the function
getPatternGeneSet
on my CoGAPS results using the hallmark gene list frommsigdbr
, as demonstrated in the following code:When looking at the gene set testing result for a given pattern (e.g.
View(hallmarks_ora[[1]])
), I found thepathway
andgene.set
columns are mismatched which I think is not an intended behavior. Please see the snapshot below:By looking at the code, I found the issue may be that
fgsea
functions return sorted results on thepadj
column rather than keeping the same order as the inputgene.set
. As a result, directly setting the gene set names from the input names will cause a mismatch.I hope the description above is clear and should be helpful for you.