Open Nelson-Gon opened 2 years ago
I don't think it's necessary to modify it.
(1) When the input order of the genelist changes, the NES and pvalue/p.adjust/qvalues of the enrichment result stay the same; only rank and leading_edge change. The rank and leading_edge are calculated from the order of the input genelist, so if you want to enable GSEA to support input in any order, this code should also be modified to ensure that the results are completely consistent.
(2) Maybe an easier way is to sort the user's input directly, and then do the following analysis.
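A minimal sketch of option (2), assuming the input is a named numeric vector of scores. The helper name normalise_gene_list and the example values are hypothetical and not part of DOSE; the point is only that sorting up front makes any input order equivalent before the analysis starts.

```r
# Hypothetical ranked gene list: names are gene IDs, values are scores
# (illustrative values only).
geneList <- c(g1 = -1.2, g2 = 0.4, g3 = 2.5, g4 = -0.3, g5 = 1.1)

# Hypothetical helper: sort the input in decreasing order no matter
# how the user supplied it.
normalise_gene_list <- function(x) {
  stopifnot(is.numeric(x), !is.null(names(x)))
  sort(x, decreasing = TRUE)
}

sorted    <- normalise_gene_list(geneList)
ascending <- normalise_gene_list(sort(geneList))
shuffled  <- normalise_gene_list(geneList[c(3, 1, 5, 2, 4)])

# With distinct scores, every input order ends up as the same vector.
identical(sorted, ascending)
identical(sorted, shuffled)
```

With tied scores the relative order of the tied genes can still depend on the input order, so this only guarantees identical input when the scores are distinct.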
I am using this with user-sorted data. The data is sorted in ascending order, but GSEA using fgsea won't run because of the "should be sorted in decreasing order only" error.
Changing this part of the code has no effect on rank and leading_edge because we are only allowing a user to provide a geneList sorted in any order. What happens downstream of the input data has remained intact, so I think this shouldn't result in any inconsistencies.
The idea behind is.sorted seems to be to check whether the geneList we have provided is sorted in descending order, regardless of the gsea method. The PR only avoids this check and does not affect anything else.
DOSE:::is.sorted
function (x, decreasing = TRUE)
{
all(sort(x, decreasing = decreasing) == x)
}
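The check printed above can be tried in isolation. The sketch below copies the function body into a local function (named is_sorted here so it runs without DOSE installed) and applies it to a small made-up vector; the values are illustrative only.

```r
# Local copy of the function body shown above, so this runs without DOSE.
is_sorted <- function(x, decreasing = TRUE) {
  all(sort(x, decreasing = decreasing) == x)
}

# Hypothetical scores, already in descending order.
scores <- c(g1 = 2.5, g2 = 1.1, g3 = 0.4, g4 = -0.3)

is_sorted(scores)                            # TRUE: descending passes the default
is_sorted(rev(scores))                       # FALSE: ascending fails the default
is_sorted(rev(scores), decreasing = FALSE)   # TRUE: ascending passes when asked for
```

Because decreasing defaults to TRUE, an ascending geneList is rejected unless the caller overrides the argument.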
If the order of the input is randomly sorted, the results (rank and leading_edge) will be affected: test_gsea.pdf
Thanks, I've only been looking at it visually and it seemed the graph would be a mirror image of that of the descending sort. We would expect genes at the left in a descending sort to show up on the right instead. From the output (e.g. gsedo2), it seems this is the case (genes move one step further in the ranking). Only the ranking seems affected and not the leading edge as might be expected (gsedo vs gsedo2, comparing only ascending vs descending, not a random sort).
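To make the "mirror image" intuition concrete, here is a toy, unweighted running-sum statistic. It is a deliberately simplified illustration and is not the DOSE or fgsea implementation; the gene IDs, the gene set, and the helper names running_sum and peak_position are all made up.

```r
# Toy running sum: step up on gene-set members, step down otherwise.
# Simplified and unweighted; NOT the DOSE or fgsea code.
running_sum <- function(gene_ids, gene_set) {
  hit <- gene_ids %in% gene_set
  step <- ifelse(hit, 1 / sum(hit), -1 / sum(!hit))
  cumsum(step)
}

# Position at which the running sum deviates most from zero.
peak_position <- function(rs) which.max(abs(rs))

genes    <- paste0("g", 1:10)      # hypothetical ranked gene IDs
gene_set <- c("g1", "g2", "g4")    # hypothetical gene set

desc <- running_sum(genes, gene_set)        # original order
asc  <- running_sum(rev(genes), gene_set)   # reversed order

peak_position(desc)   # 4
peak_position(asc)    # 6
```

In this toy model the reversed curve is the original one mirrored and negated, so the peak has the same magnitude but sits at a different position. Any output derived from positions in the list therefore depends on the order in which the list was supplied.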
Your submission will bring hidden dangers, as I mentioned earlier.
Hi @huerqiang
I have taken a look at what fgsea does and it seems the order does not in fact matter, in which case the sorting may be irrelevant.
From https://github.com/ctlab/fgsea/issues/96 and other related issues though, it seems that changing the sort to anything else would have produced similar results here, so I am not sure what exactly causes the differences in output that you noted earlier.
@Nelson-Gon “The rank and leading_edge are calculated by the order of the input genelist, so if you want enable GSEA to support input in any order, these code should also be modified to ensure that the results are completely consistent.” https://github.com/YuLab-SMU/DOSE/blob/master/R/gsea.R#L356-L427
Hi,
This removes the need for a decreasing-only sort if not using DOSE. In is.sorted, decreasing was set to TRUE, which also affected fgsea.
Thank you, NelsonGon