ctlab / fgsea

Fast Gene Set Enrichment Analysis

Error in if (maxP > -minP) { : missing value where TRUE/FALSE needed #82

Closed nfancy closed 4 years ago

nfancy commented 4 years ago

Hi, thank you very much for the package. I recently updated to a newer version and am now getting this error. Could you please shed some light on it?

Here are my command and error:

database_list <- list.files(database_dir, pattern = ".gmt", full.names = TRUE)
pathways <- fgsea::gmtPathways(database_list[1])

de <- read.delim("de_table.tsv")
ranked_gene <- de %>% 
  mutate(stat = sign(logFC)* -log10(padj)) %>%
  dplyr::select(gene, stat) %>% 
  arrange(desc(stat)) %>%
  tibble::deframe()

fgseaRes <- fgsea::fgseaSimple(pathways=pathways, stats=ranked_gene, nperm = 1000)

Error in if (maxP > -minP) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam,  :
  There are ties in the preranked stats (3.77% of the list).
The order of those tied genes will be arbitrary, which may produce unexpected results.

Thanks in advance.

assaron commented 4 years ago

Can you please post your data, so I can reproduce the error?

nfancy commented 4 years ago

hi, the data is attached. This is the command that gives me the error message:

> fgseaRes <- fgsea::fgsea(pathways=geneset, stats=ranked_gene,
+                          nperm=1000, maxSize = 1000)
Error in if (maxP > -minP) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In fgsea::fgsea(pathways = geneset, stats = ranked_gene, nperm = 1000,  :
  There are ties in the preranked stats (3.57% of the list).
The order of those tied genes will be arbitrary, which may produce unexpected results.

However, I later figured out that the following command runs without an error. Can you please explain what gseaParam does?

> fgseaRes <- fgsea::fgsea(pathways=geneset, stats=ranked_gene,
+                          nperm=1000, maxSize = 1000, gseaParam = 0)
Warning message:
In fgsea::fgsea(pathways = geneset, stats = ranked_gene, nperm = 1000,  :
  There are ties in the preranked stats (3.57% of the list).
The order of those tied genes will be arbitrary, which may produce unexpected results.

Many thanks for your response.

data.zip

assaron commented 4 years ago

This happens because of an infinite value in your ranked_gene stats. Setting gseaParam to zero keeps the order but sets all the stats to 1, because every value is raised to the power of zero.

We'll add a check, so that the error message will be more appropriate.
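
For anyone hitting the same error, here is a minimal sketch of how such an infinite value can arise and how it might be caught before calling fgsea. It assumes the metric from the original post, sign(logFC) * -log10(padj), and that some adjusted p-values are reported as exactly 0. The toy table is made up (it is not the attached data), and capping zero p-values at the smallest non-zero one is only one possible workaround, not what the package check does.

```r
# Toy differential-expression table (hypothetical values, not the attached data).
de <- data.frame(
  gene  = c("A", "B", "C", "D"),
  logFC = c(2.1, -1.3, 0.4, -0.2),
  padj  = c(0, 1e-8, 0.03, 0.5)
)

# Same ranking metric as in the report: sign(logFC) * -log10(padj).
stat <- sign(de$logFC) * -log10(de$padj)
names(stat) <- de$gene

# padj == 0 gives -log10(0) = Inf, so gene "A" gets an infinite statistic.
non_finite <- names(stat)[!is.finite(stat)]
print(non_finite)

# Possible workaround (an assumption, not the package's fix):
# replace zero p-values with the smallest non-zero one before ranking.
padj_fixed <- pmax(de$padj, min(de$padj[de$padj > 0]))
stat_fixed <- sign(de$logFC) * -log10(padj_fixed)
names(stat_fixed) <- de$gene
stat_fixed <- sort(stat_fixed, decreasing = TRUE)
print(stat_fixed)
```

Running is.finite() on the stats vector before calling fgsea is a cheap way to confirm whether this is the cause in your own data.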

assaron commented 4 years ago

We've added the check in 07f86379b5c6fcbeca82e2f6937ec527eae95b42

nfancy commented 4 years ago

thanks a lot. This brings another question though. What is the appropriate gseaParam value?

assaron commented 4 years ago

This parameter controls how gene weights are used: the absolute values of the weights are raised to the power of gseaParam after ordering. A gseaParam of zero makes all weights equal, resulting in a Kolmogorov-Smirnov test; a gseaParam of one uses the weights as is. Usually there is no need to change it from the default value of 1, unless you use some weird metric or have a specific reason in mind.
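
To illustrate the description above, here is a small sketch in plain R of what raising the absolute stats to the power of gseaParam does to the weights. The helper weights_for and the example values are hypothetical and only mimic the transformation as described in this thread; this is not fgsea's internal code.

```r
# Hypothetical pre-ranked stats, already sorted in decreasing order.
stats <- c(geneA = 3, geneB = 1.5, geneC = -0.5, geneD = -2)

# Illustrative helper (not part of fgsea): weight = |stat|^gseaParam.
weights_for <- function(stats, gseaParam) abs(stats)^gseaParam

w0 <- weights_for(stats, 0)  # every weight becomes 1 (unweighted, KS-like)
w1 <- weights_for(stats, 1)  # weights are the absolute stats, used as is
w2 <- weights_for(stats, 2)  # large |stat| values dominate even more

print(rbind(w0, w1, w2))

# An infinite stat also becomes 1 when gseaParam = 0, which is why
# that setting hides the problem instead of fixing it.
print(Inf^0)
```

The ordering of the genes is the same in all three cases; only the size of the step each gene contributes to the running sum changes.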