Open mick42-star opened 2 years ago
You did not provide your input data, but I suspect this happens because you use gene symbols as input. In my experience, gene symbols are regularly duplicated (i.e. the same gene symbol is used for 2 different genes; see below for examples), and as a result `ridgeplot` fails (because it plots all genes belonging to a gene set). `dotplot` will work because it plots only the gene-set-level results; no information on the constituent genes is needed for that plot.
I suggest you use unique identifiers in all your analyses, such as Entrez or Ensembl ids.
```
> library(org.Hs.eg.db)
> ids <- keys(org.Hs.eg.db)
>
> mapping <- select(org.Hs.eg.db, keys=ids, columns=c('ENTREZID','SYMBOL'), keytype='ENTREZID')
'select()' returned 1:1 mapping between keys and columns
> mapping[duplicated(mapping[,"SYMBOL"]),]
       ENTREZID    SYMBOL
4795       6052      RNR1
4796       6053      RNR2
11493     51072     MEMO1
28771 100124696       TEC
30812 100187828       HBD
35761 100505381      MMD2
53263 107648861  DEL11P13
54541 107985615 TRNAV-CAC
54642 107985753 TRNAV-CAC
63864 122405565    SMIM44
64999 123670537   DEL1P36
>
> select(org.Hs.eg.db, keys="HBD", columns=c('ENTREZID','SYMBOL'), keytype='SYMBOL')
'select()' returned 1:many mapping between keys and columns
  SYMBOL  ENTREZID
1    HBD      3045
2    HBD 100187828
>
> select(org.Hs.eg.db, keys="MEMO1", columns=c('ENTREZID','SYMBOL'), keytype='SYMBOL')
'select()' returned 1:many mapping between keys and columns
  SYMBOL ENTREZID
1  MEMO1     7795
2  MEMO1    51072
>
> select(org.Hs.eg.db, keys="TRNAV-CAC", columns=c('ENTREZID','SYMBOL'), keytype='SYMBOL')
'select()' returned 1:many mapping between keys and columns
     SYMBOL  ENTREZID
1 TRNAV-CAC 107985614
2 TRNAV-CAC 107985615
3 TRNAV-CAC 107985753
>
```
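As a rough illustration of the suggestion to switch to unique identifiers, here is a minimal base-R sketch. It assumes you already have a SYMBOL/ENTREZID table such as the one returned by `select()`; the `stats_by_symbol` values are made up, the IFNG and IL10 rows are added for illustration only, and dropping ambiguous symbols is just one possible policy.

```r
# Toy SYMBOL -> ENTREZID table. The HBD and MEMO1 rows are copied from the
# select() output above; the IFNG and IL10 rows are added for illustration.
# In practice this table would come from select() on org.Hs.eg.db.
mapping <- data.frame(
  SYMBOL   = c("HBD", "HBD", "MEMO1", "MEMO1", "IFNG", "IL10"),
  ENTREZID = c("3045", "100187828", "7795", "51072", "3458", "3586"),
  stringsAsFactors = FALSE
)

# Hypothetical ranking statistics keyed by gene symbol (values are made up).
stats_by_symbol <- c(HBD = 1.2, IFNG = 0.8, IL10 = -0.1, MEMO1 = -0.4)

# Symbols that map to more than one Entrez id are ambiguous.
ambiguous <- unique(mapping$SYMBOL[duplicated(mapping$SYMBOL)])

# One possible policy: drop ambiguous symbols instead of guessing an id.
unique_map <- mapping[!mapping$SYMBOL %in% ambiguous, ]
id_lookup  <- setNames(unique_map$ENTREZID, unique_map$SYMBOL)

# Keep genes that have exactly one id, rename them, and sort decreasingly.
gene_list <- stats_by_symbol[names(stats_by_symbol) %in% names(id_lookup)]
names(gene_list) <- unname(id_lookup[names(gene_list)])
gene_list <- sort(gene_list, decreasing = TRUE)

print(ambiguous)  # symbols that were dropped
print(gene_list)  # ranked statistics, now named by Entrez id
```

Printing `ambiguous` before discarding anything makes it easy to see which genes are lost and to decide by hand which Entrez id was actually intended.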
Many thanks. It solved the problem when I converted all gene names to ENTREZID.
I had the same problem with this code:
```
kegg_gene_list <- c(
7042= 0.365,
10135= 0.218,
3553= 0.175,
5291= 0.167,
114548= 0.163,
22861= 0.089,
4780= 0.078,
942= 0.061,
3902= -0.005,
3586= -0.011,
1029= -0.186,
3458` = -0.282
)
gse_res <- gseDO(kegg_gene_list, minGSSize = 5, pvalueCutoff = 0.2, pAdjustMethod = "BH", verbose = FALSE)
enrichplot::ridgeplot(gse_res)
```
Do you have, or did you have, the same problem? What was the error message?
In any case, make sure that the ids of your input are character strings! As written, the ids are unquoted numbers, which R does not accept as element names. Also note the stray back-tick (`) after the last id (3458).
After correcting this and setting the significance cutoff to 1, it works for me:
```
> ## load libraries
> library(clusterProfiler)
> library(DOSE)
> library(enrichplot)
>
> ## input, with ids as characters (and not numeric)
> kegg_gene_list <- c("7042"= 0.365,"10135"= 0.218,"3553"= 0.175,"5291"= 0.167, "114548"= 0.163,"22861"= 0.089,
+ "4780"= 0.078,"942"= 0.061, "3902"= -0.005,"3586"= -0.011,"1029"= -0.186,"3458"= -0.282)
>
> gse_res <- gseDO(kegg_gene_list,
+ minGSSize = 5,
+ pvalueCutoff = 1,
+ pAdjustMethod = "BH",
+ verbose = FALSE)
>
> ## check
> gse_res
#
# Gene Set Enrichment Analysis
#
#...@organism Homo sapiens
#...@setType DO
#...@keytype ENTREZID
#...@geneList Named num [1:12] 0.365 0.218 0.175 0.167 0.163 0.089 0.078 0.061 -0.005 -0.011 ...
 - attr(*, "names")= chr [1:12] "7042" "10135" "3553" "5291" ...
#...nPerm
#...pvalues adjusted by 'BH' with cutoff <1
#...120 enriched terms found
'data.frame': 120 obs. of 11 variables:
 $ ID : chr "DOID:2531" "DOID:0070004" "DOID:1909" "DOID:4960" ...
 $ Description : chr "hematologic cancer" "myeloid neoplasm" "melanoma" "bone marrow cancer" ...
 $ setSize : int 7 6 6 6 6 5 5 5 5 5 ...
 $ enrichmentScore: num -0.781 -0.757 -0.757 -0.757 -0.757 ...
 $ NES : num -1.73 -1.62 -1.62 -1.62 -1.62 ...
 $ pvalue : num 0.0502 0.0388 0.0388 0.0388 0.0388 ...
 $ p.adjust : num 0.589 0.589 0.589 0.589 0.589 ...
 $ qvalue : num 0.589 0.589 0.589 0.589 0.589 ...
 $ rank : num 7 6 6 6 6 4 4 4 4 4 ...
 $ leading_edge : chr "tags=86%, list=58%, signal=86%" "tags=83%, list=50%, signal=83%" "tags=83%, list=50%, signal=83%" "tags=83%, list=50%, signal=83%" ...
 $ core_enrichment: chr "4780/942/3902/3586/1029/3458" "942/3902/3586/1029/3458" "942/3902/3586/1029/3458" "942/3902/3586/1029/3458" ...
#...Citation
Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an
R/Bioconductor package for Disease Ontology Semantic and Enrichment
analysis. Bioinformatics 2015, 31(4):608-609
>
> ## plot
> ridgeplot(gse_res)
Picking joint bandwidth of 0.0757
>
> sessionInfo()
R version 4.4.0 Patched (2024-05-21 r86580 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

time zone: Europe/Amsterdam
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] enrichplot_1.24.0 DOSE_3.30.1 clusterProfiler_4.12.0

loaded via a namespace (and not attached):
[1] DBI_1.2.2 shadowtext_0.1.3 gson_0.1.0
[4] gridExtra_2.3 rlang_1.1.3 magrittr_2.0.3
[7] ggridges_0.5.6 compiler_4.4.0 RSQLite_2.3.6
[10] png_0.1-8 vctrs_0.6.5 reshape2_1.4.4
[13] stringr_1.5.1 pkgconfig_2.0.3 crayon_1.5.2
[16] fastmap_1.2.0 XVector_0.44.0 labeling_0.4.3
[19] ggraph_2.2.1 utf8_1.2.4 HDO.db_0.99.1
[22] UCSC.utils_1.0.0 purrr_1.0.2 bit_4.0.5
[25] zlibbioc_1.50.0 cachem_1.1.0 aplot_0.2.2
[28] GenomeInfoDb_1.40.0 jsonlite_1.8.8 blob_1.2.4
[31] BiocParallel_1.38.0 tweenr_2.0.3 parallel_4.4.0
[34] R6_2.5.1 stringi_1.8.4 RColorBrewer_1.1-3
[37] GOSemSim_2.30.0 Rcpp_1.0.12 snow_0.4-4
[40] IRanges_2.38.0 Matrix_1.7-0 splines_4.4.0
[43] igraph_2.0.3 tidyselect_1.2.1 qvalue_2.36.0
[46] viridis_0.6.5 codetools_0.2-20 lattice_0.22-6
[49] tibble_3.2.1 plyr_1.8.9 Biobase_2.64.0
[52] treeio_1.28.0 withr_3.0.0 KEGGREST_1.44.0
[55] gridGraphics_0.5-1 scatterpie_0.2.2 polyclip_1.10-6
[58] Biostrings_2.72.0 pillar_1.9.0 ggtree_3.12.0
[61] stats4_4.4.0 ggfun_0.1.4 generics_0.1.3
[64] S4Vectors_0.42.0 ggplot2_3.5.1 munsell_0.5.1
[67] scales_1.3.0 tidytree_0.4.6 glue_1.7.0
[70] lazyeval_0.2.2 tools_4.4.0 data.table_1.15.4
[73] fgsea_1.30.0 fs_1.6.4 graphlayouts_1.1.1
[76] fastmatch_1.1-4 tidygraph_1.3.1 cowplot_1.1.3
[79] grid_4.4.0 tidyr_1.3.1 ape_5.8
[82] AnnotationDbi_1.66.0 colorspace_2.1-0 nlme_3.1-164
[85] GenomeInfoDbData_1.2.12 patchwork_1.2.0 ggforce_0.4.2
[88] cli_3.6.2 fansi_1.0.6 viridisLite_0.4.2
[91] dplyr_1.1.4 gtable_0.3.5 yulab.utils_0.1.4
[94] digest_0.6.35 BiocGenerics_0.50.0 ggrepel_0.9.5
[97] ggplotify_0.1.2 farver_2.1.2 memoise_2.0.1
[100] lifecycle_1.0.4 httr_1.4.7 GO.db_3.19.1
[103] bit64_4.0.5 MASS_7.3-60.2
>
```
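If the ids and statistics live in two parallel vectors (for example, two columns of a results table), the named input can be built programmatically instead of typing the names by hand. The sketch below uses base R only and reuses the ids and values from this thread; the variable names `ids` and `stats` are my own, and the decreasing sort reflects my understanding that GSEA functions in clusterProfiler/DOSE expect a ranked list.

```r
# Ids stored as numbers, as they often are after reading a table.
ids   <- c(7042, 10135, 3553, 5291, 114548, 22861,
           4780, 942, 3902, 3586, 1029, 3458)
stats <- c(0.365, 0.218, 0.175, 0.167, 0.163, 0.089,
           0.078, 0.061, -0.005, -0.011, -0.186, -0.282)

# Convert ids to character strings. format() with scientific = FALSE avoids
# surprises such as as.character(100000) giving "1e+05".
id_names <- format(ids, scientific = FALSE, trim = TRUE)

# Named numeric vector, sorted from highest to lowest statistic.
gene_list <- sort(setNames(stats, id_names), decreasing = TRUE)

# Basic sanity checks before passing the vector to a GSEA function.
stopifnot(
  is.character(names(gene_list)),      # ids are character strings
  !anyDuplicated(names(gene_list)),    # every id occurs once
  !is.unsorted(rev(gene_list))         # values are in decreasing order
)

str(gene_list)
```

The resulting `gene_list` holds the same values and names as the hand-typed `kegg_gene_list` in the working session above.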
It works now; perhaps the problem was caused by an old GitHub version of the package.
I use clusterProfiler version 4.5.1.902 for GSE analysis. I got the GSE result and the related dotplot; however, I cannot generate the ridgeplot.
```
gse <- gseGO(geneList=gene_list, ont ="ALL", keyType = "SYMBOL", nPerm = 10000, minGSSize = 3, maxGSSize = 800, pvalueCutoff = 0.05, verbose = TRUE, OrgDb = "org.Mm.eg.db", pAdjustMethod = "none")
dotplot(gse, showCategory=10, split=".sign") + facet_grid(.~.sign)
ridgeplot(gse)
Error in ans[ypos] <- rep(yes, length.out = len)[ypos] : replacement has length zero
```