YuLab-SMU / clusterProfiler

A universal enrichment tool for interpreting omics data
https://yulab-smu.top/biomedical-knowledge-mining-book/

Dotplot to show only selected pathways from compareCluster result #735

Closed asmariyaz23 closed 2 days ago

asmariyaz23 commented 3 days ago

I have used compareCluster to compute enriched pathways, and I want to use the dotplot function to plot only those pathways that have the substring "T cell" in their description. Is there a way to do that? I know that showCategory can be used to select pathways, but how do I do that for the S4 object returned by compareCluster?

guidohooiveld commented 2 days ago

The easiest way of doing this is IMHO to 1) generate the regular dotplot and store this in an object, 2) extract the pathways of interest, and 3) regenerate the dotplot with only the pathways of interest (i.e. those pathways containing your substring of interest).

By doing so, only the significant pathways that contain your substring of interest are included in the (regenerated) dotplot. Note that the object generated in step 1) is a ggplot2 object that contains all the information shown in the plot. Some code to illustrate the idea:

> library(clusterProfiler)
> 
> ## use the example data to make a compareCluster result
> data(gcSample)
> xx <- compareCluster(gcSample, fun="enrichKEGG",
+                      organism="hsa", pvalueCutoff=0.05)
> 
> ## show as dotplot.
> ## note 1: in this case results of 20 pathways that are significantly enriched are presented in the plot
> ## note 2: all results are stored in object p1. 
> p1 <- dotplot(xx)
> print(p1)
> 

[image: dotplot p1, all significantly enriched pathways per cluster]

> ## check the data-slot of p1
> ## this data-slot contains all plotted info
> p1$data 
     Cluster                             category
1  X2\n(410)                       Human Diseases
2  X2\n(410)                       Human Diseases
3  X2\n(410)                   Cellular Processes
4  X3\n(195) Environmental Information Processing
5  X3\n(195) Environmental Information Processing
6  X3\n(195) Environmental Information Processing
7  X4\n(440)                       Human Diseases
8  X4\n(440)       Genetic Information Processing
9  X4\n(440)                   Cellular Processes
10 X4\n(440) Environmental Information Processing
11 X4\n(440)                       Human Diseases
16 X4\n(440)       Genetic Information Processing
28 X5\n(481)                   Organismal Systems
29 X5\n(481)                       Human Diseases
30 X5\n(481)                   Organismal Systems
31 X5\n(481)       Genetic Information Processing
32 X5\n(481)                       Human Diseases
33 X5\n(481) Environmental Information Processing
37 X6\n(286)                   Cellular Processes
38 X7\n(327)                       Human Diseases
39 X7\n(327)                       Human Diseases
40 X7\n(327)                       Human Diseases
41 X7\n(327)                       Human Diseases
42 X7\n(327)       Genetic Information Processing
43 X7\n(327)                       Human Diseases
56 X8\n(162)                       Human Diseases
57 X8\n(162)                       Human Diseases
58 X8\n(162)                       Human Diseases
59 X8\n(162)                       Human Diseases
60 X8\n(162)                       Human Diseases
67 X8\n(162)       Genetic Information Processing
                           subcategory       ID
1            Infectious disease: viral hsa05169
2                       Immune disease hsa05340
3                Cell growth and death hsa04110
4  Signaling molecules and interaction hsa04512
5  Signaling molecules and interaction hsa04060
6                  Signal transduction hsa04151
7               Cancer: specific types hsa05215
8               Replication and repair hsa03030
9                Cell growth and death hsa04110
10                 Signal transduction hsa04068
11     Drug resistance: antineoplastic hsa01521
16    Folding, sorting and degradation hsa03050
28                               Aging hsa04211
29     Endocrine and metabolic disease hsa04933
30                               Aging hsa04213
31              Replication and repair hsa03030
32     Drug resistance: antineoplastic hsa01524
33                 Signal transduction hsa04068
37               Cell growth and death hsa04110
38           Neurodegenerative disease hsa05020
39           Neurodegenerative disease hsa05012
40           Neurodegenerative disease hsa05016
41           Neurodegenerative disease hsa05010
42    Folding, sorting and degradation hsa03050
43           Neurodegenerative disease hsa05014
56           Neurodegenerative disease hsa05012
57           Neurodegenerative disease hsa05010
58           Neurodegenerative disease hsa05020
59           Neurodegenerative disease hsa05016
60           Neurodegenerative disease hsa05014
67    Folding, sorting and degradation hsa03050
                                            Description  GeneRatio  BgRatio
1                          Epstein-Barr virus infection 0.05609756 203/8850
2                              Primary immunodeficiency 0.01951220  38/8850
3                                            Cell cycle 0.04390244 158/8850
4                              ECM-receptor interaction 0.04615385  89/8850
5                Cytokine-cytokine receptor interaction 0.08717949 298/8850
6                            PI3K-Akt signaling pathway 0.09743590 362/8850
7                                       Prostate cancer 0.03863636  98/8850
8                                       DNA replication 0.02272727  36/8850
9                                            Cell cycle 0.05000000 158/8850
10                               FoxO signaling pathway 0.04318182 133/8850
11            EGFR tyrosine kinase inhibitor resistance 0.03181818  80/8850
16                                           Proteasome 0.02045455  46/8850
28                         Longevity regulating pathway 0.03118503  90/8850
29 AGE-RAGE signaling pathway in diabetic complications 0.03118503 101/8850
30      Longevity regulating pathway - multiple species 0.02286902  62/8850
31                                      DNA replication 0.01663202  36/8850
32                             Platinum drug resistance 0.02494802  75/8850
33                               FoxO signaling pathway 0.03534304 133/8850
37                                           Cell cycle 0.06643357 158/8850
38                                        Prion disease 0.09480122 278/8850
39                                    Parkinson disease 0.09174312 271/8850
40                                   Huntington disease 0.09785933 311/8850
41                                    Alzheimer disease 0.11009174 391/8850
42                                           Proteasome 0.03363914  46/8850
43                        Amyotrophic lateral sclerosis 0.10397554 371/8850
56                                    Parkinson disease 0.17901235 271/8850
57                                    Alzheimer disease 0.19753086 391/8850
58                                        Prion disease 0.16666667 278/8850
59                                   Huntington disease 0.16666667 311/8850
60                        Amyotrophic lateral sclerosis 0.17283951 371/8850
67                                           Proteasome 0.04320988  46/8850
   RichFactor FoldEnrichment    zScore       pvalue     p.adjust       qvalue
1  0.11330049       2.445633  4.592414 6.196364e-05 1.958051e-02 1.826297e-02
2  0.21052632       4.544288  4.825600 2.843178e-04 3.799916e-02 3.544226e-02
3  0.11392405       2.459092  4.078676 3.607516e-04 3.799916e-02 3.544226e-02
4  0.10112360       4.589455  5.108314 1.413151e-04 3.553325e-02 3.254232e-02
5  0.05704698       2.589055  4.188361 3.022218e-04 3.553325e-02 3.254232e-02
6  0.05248619       2.382065  4.030056 3.848366e-04 3.553325e-02 3.254232e-02
7  0.17346939       3.489100  5.667313 5.590342e-06 1.015143e-03 8.326531e-04
8  0.27777778       5.587121  6.307836 6.591837e-06 1.015143e-03 8.326531e-04
9  0.13924051       2.800633  5.223593 1.045311e-05 1.073186e-03 8.802618e-04
10 0.14285714       2.873377  4.979008 2.902699e-05 2.084494e-03 1.709770e-03
11 0.17500000       3.519886  5.178472 3.383919e-05 2.084494e-03 1.709770e-03
16 0.19565217       3.935277  4.565238 3.639596e-04 1.120995e-02 9.194768e-03
28 0.16666667       3.066528  4.723820 9.008700e-05 2.855758e-02 2.503470e-02
29 0.14851485       2.732550  4.198074 3.374443e-04 4.035679e-02 3.537836e-02
30 0.17741935       3.264369  4.289244 4.443583e-04 4.035679e-02 3.537836e-02
31 0.22222222       4.088704  4.451679 5.632476e-04 4.035679e-02 3.537836e-02
32 0.16000000       2.943867  4.052812 6.597502e-04 4.035679e-02 3.537836e-02
33 0.12781955       2.351773  3.765555 8.367405e-04 4.035679e-02 3.537836e-02
37 0.12025316       3.721121  6.306789 7.241045e-07 2.157831e-04 2.149447e-04
38 0.11151079       3.017953  6.696024 3.158778e-08 9.693736e-06 8.580580e-06
39 0.11070111       2.996039  6.536723 6.294634e-08 9.693736e-06 8.580580e-06
40 0.10289389       2.784743  6.275915 1.261186e-07 1.294818e-05 1.146130e-05
41 0.09207161       2.491846  5.909867 3.274592e-07 2.521436e-05 2.231893e-05
42 0.23913043       6.471879  7.287861 6.201380e-07 3.820050e-05 3.381384e-05
43 0.09164420       2.480279  5.705358 7.938800e-07 4.075251e-05 3.607279e-05
56 0.10701107       5.845975 11.063491 7.084817e-15 1.643677e-12 1.424421e-12
57 0.08184143       4.470967  9.585705 4.788449e-13 5.554601e-11 4.813652e-11
58 0.09712230       5.305755  9.960368 7.190237e-13 5.560450e-11 4.818720e-11
59 0.08681672       4.742765  9.175167 1.037262e-11 6.016121e-10 5.213608e-10
60 0.07547170       4.122991  8.391316 1.157900e-10 5.372658e-09 4.655979e-09
67 0.15217391       8.313205  6.790337 1.776333e-05 3.434244e-04 2.976137e-04
                                                                                                                                                                                  geneID
1                                                                              4067/3383/7128/1869/890/1871/578/864/637/9641/6891/355/9134/5971/916/956/6850/7187/3551/919/4734/958/6772
2                                                                                                                                                    100/6891/3932/973/916/925/958/64421
3                                                                                                991/1869/890/1871/701/990/10926/9088/8317/9700/9134/1029/2810/699/11200/23594/8555/4173
4                                                                                                                                           7057/3339/1299/3695/1101/3679/3910/3696/3693
5                                                                                                   2919/4982/3977/6375/8200/608/8792/3568/2057/1438/8718/655/652/10220/50615/51561/7042
6                                                                                          894/7057/6794/2247/1299/3695/2252/2066/1101/8817/1021/5105/3679/3082/2057/3910/3551/3696/3693
7                                                                                                  2950/1387/5159/5604/5156/596/4318/3551/367/2260/5595/5295/10000/6935/6655/90993/80310
8                                                                                                                                    5425/4172/4175/4171/10535/5984/2237/4176/54107/4173
9                                                                           6500/9184/4172/994/4175/4171/1387/10274/8697/902/4616/23047/5591/4176/8881/7043/983/1022/23063/1028/891/4173
10                                                                                       7874/5571/10769/1387/5604/901/5106/4616/8660/3551/7043/5595/5295/10000/6655/3643/891/10365/6648
11                                                                                                                 5159/5604/558/5156/596/9470/5595/7422/5295/10000/4763/6655/80310/1978
16                                                                                                                                        5701/5704/11047/5713/5718/5699/5698/23198/5688
28                                                                                                             5564/2308/3480/5293/1385/115/3667/9370/7248/3479/9474/10000/107/6648/7249
29                                                                                                            3383/2308/6776/7412/5293/7056/5336/185/6772/5603/4772/6777/10000/6347/4088
30                                                                                                                                 5564/2308/3480/5293/115/3667/3479/9474/10000/107/6648
31                                                                                                                                            5982/57804/5983/2237/5557/4173/4174/246243
32                                                                                                                             7153/332/2952/5293/1317/842/355/7507/4436/1029/2948/10000
33                                                                                                 5564/664/9133/2308/6502/3480/5293/1017/9454/3667/3479/5603/10000/6654/6648/4088/92579
37                                                                                       4174/5347/990/10403/23047/7272/4999/1869/995/1111/4088/9212/81620/1031/9134/4173/1028/898/26271
38                          5690/3915/5714/5708/1537/1965/5686/5621/4698/5683/5684/4700/10381/7979/5532/4729/5685/4726/4723/5717/5693/4704/4695/4724/1329/857/10213/1350/4705/4708/54539
39                              5690/7332/5714/5708/1843/1537/1965/5686/4698/5683/5684/4700/10381/7979/4729/5685/4726/4723/5717/5693/4704/4695/4724/1329/10213/1350/2778/4705/4708/54539
40                     5690/163/5714/5708/1537/5686/4698/5683/5684/3066/4700/10381/7979/2776/5440/4729/5685/4726/1175/4723/5717/5693/8678/4704/4695/4724/1329/10213/1350/4705/4708/54539
41 5690/5714/5708/1537/1965/5686/4698/5683/5684/4700/10381/8883/7979/3028/5605/5532/2776/4729/348/5685/4726/1452/4723/5717/5693/8678/4704/8660/4695/4724/1329/10213/1350/4705/4708/54539
42                                                                                                                               5690/5714/5708/5686/5683/5684/7979/5685/5717/5693/10213
43          5690/5714/5708/1537/1965/5686/4698/5683/5684/4700/10381/7979/5532/6428/4729/5685/80208/4726/311/6396/7415/4723/5717/5693/8678/4704/4695/4724/1329/10213/1350/4705/4708/54539
56                                   1327/292/7494/468/5695/5701/10376/5688/1351/1350/506/5691/27089/25800/1349/7388/5692/9377/509/518/517/5687/7295/54205/7386/10383/203068/84790/29796
57                     1327/292/7494/468/5695/5701/10376/5688/1351/1350/506/1460/5691/27089/25800/1349/7388/5692/9377/509/518/517/5687/1452/54205/7386/10383/203068/488/84790/2597/29796
58                                              1327/292/468/5695/5701/10376/5688/1351/1350/506/1460/5691/27089/1349/7388/5692/9377/509/518/517/5687/54205/7386/10383/203068/84790/29796
59                                             1327/292/5695/1211/5701/10376/5688/1351/1350/506/5691/27089/1349/7388/5692/9377/509/518/517/5687/54205/7386/10383/203068/84790/1175/29796
60                                        1327/9782/7494/468/5695/5701/10376/5688/1351/1350/506/5691/27089/1349/7388/5692/9377/509/518/517/5687/54205/7386/10383/203068/84790/6432/29796
67                                                                                                                                                    5695/5701/5688/5691/5721/5692/5687
   Count
1     23
2      8
3     18
4      9
5     17
6     19
7     17
8     10
9     22
10    19
11    14
16     9
28    15
29    15
30    11
31     8
32    12
33    17
37    19
38    31
39    30
40    32
41    36
42    11
43    34
56    29
57    32
58    27
59    27
60    28
67     7
> 
> ## using grepl extract pathways that match substring of interest.
> ## in this case show only pathways that contain 'disease' in their description
> geneSets2plot <- unique( p1$data$Description[ grepl("disease", p1$data$Description, fixed=TRUE) ] )
> 
> geneSets2plot
[1] Prion disease      Parkinson disease  Huntington disease Alzheimer disease 
20 Levels: Amyotrophic lateral sclerosis ... Epstein-Barr virus infection
> 
> 
> ## regenerate dotplot with only pathways of interest (i.e. those with 'disease' in description)
> p2 <- dotplot(xx, showCategory=geneSets2plot)
> 
> print(p2)
> 
>

[image: dotplot p2, only pathways with 'disease' in the description]
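
A side note on the pattern: `grepl()` takes a regular expression rather than a shell-style glob, so a plain substring needs no `*` wildcards around it. Below is a minimal sketch on a hand-made vector (the `descriptions` values are copied from the output above purely for illustration) showing a literal substring match and a case-insensitive one.

```r
## toy vector of pathway descriptions (values taken from the p1$data output above)
descriptions <- c("Prion disease", "Parkinson disease", "Cell cycle",
                  "Primary immunodeficiency", "Proteasome")

## fixed = TRUE treats the pattern as a literal substring, not a regex
hits_fixed <- descriptions[grepl("disease", descriptions, fixed = TRUE)]

## ignore.case = TRUE also catches differently capitalised descriptions;
## note that ignore.case is ignored (with a warning) when fixed = TRUE
hits_nocase <- descriptions[grepl("DISEASE", descriptions, ignore.case = TRUE)]

print(hits_fixed)
print(hits_nocase)
```

Both calls select the same two descriptions here; which variant is preferable depends on how the terms of interest are capitalised in the annotation.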

asmariyaz23 commented 2 days ago

Hello,

I used compareCluster:

GO_MF_up<-compareCluster(geneClusters=sig_genes_up,fun="enrichGO",OrgDb=org.Mm.eg.db,ont="MF",keyType="SYMBOL")

and then used dotplot this way:

    p1<-dotplot(GO_MF_up,showCategory=10,font.size=10,title="GO MF enrichment, genes up")
    geneSets2plot <- unique( p1$data$Description[ grepl("*T cell*", p1$data$Description) ] )
    p2 <- dotplot(GO_MF_up, showCategory=geneSets2plot)
    p2

But I get the error below:

Error: unable to find an inherited method for function ‘dotplot’ for signature ‘object = "enrichplotDot"’

Should I use a different "fun" in compareCluster for this to work?

Thank you, Asma

guidohooiveld commented 2 days ago

It could be that the function dotplot is masked by another package.

What happens if you explicitly call the function from enrichplot? Thus: enrichplot::dotplot(GO_MF_up,showCategory=10,font.size=10,title="GO MF enrichment, genes up").
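
One possible way to check for masking, sketched with base R only: `find()` lists every attached package that defines a name, and `conflicts()` lists names defined more than once on the search path. The example uses `"filter"` only because it exists in a default session; substituting `"dotplot"` after loading your packages is the assumed use.

```r
## which attached packages/environments define a given function name?
## "filter" is used so that this runs in a plain R session;
## replace it with "dotplot" once your packages are loaded
providers <- find("filter")
print(providers)

## all names that are defined in more than one place on the search path
masked <- conflicts(detail = FALSE)
print("dotplot" %in% masked)
```

If more than one package shows up for `dotplot`, the `enrichplot::dotplot(...)` form avoids the ambiguity.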

Also, does the content of geneSets2plot match with what you anticipate?
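
A quick way to answer that question before replotting is to test whether the selection is empty. The sketch below uses a toy factor in place of `p1$data$Description`; the helper name `select_sets` is hypothetical and not part of clusterProfiler or enrichplot.

```r
## hypothetical helper: return matching descriptions as a character vector,
## or stop with a readable message when nothing matches
select_sets <- function(descriptions, pattern) {
  keep <- grepl(pattern, descriptions, fixed = TRUE)
  hits <- unique(as.character(descriptions[keep]))
  if (length(hits) == 0) {
    stop("no plotted pathway description contains '", pattern, "'")
  }
  hits
}

## toy stand-in for p1$data$Description (a factor in the output above)
desc <- factor(c("Cell cycle", "T cell receptor signaling pathway", "Cell cycle"))

print(select_sets(desc, "T cell"))

## an empty selection is reported here instead of failing later inside dotplot
res <- tryCatch(select_sets(desc, "B cell"), error = function(e) conditionMessage(e))
print(res)
```

With the real objects one would presumably call `select_sets(p1$data$Description, "T cell")` and pass the result to `showCategory`.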

asmariyaz23 commented 2 days ago

    p1<-dotplot(GO_MF_up,showCategory=10,font.size=10,title="GO MF enrichment, genes up")
    geneSets2plot <- unique( p1$data$Description[ grepl("*T cell*", p1$data$Description) ] )
    p2 <- enrichplot::dotplot(GO_MF_up, showCategory=geneSets2plot)
    p2

the error I get now is:

Error in if (inherits(result$GeneRatio, "character") && grep("/", result$GeneRatio[1])) { : 
  missing value where TRUE/FALSE needed

asmariyaz23 commented 2 days ago

Update: @guidohooiveld's solution helped. The MF ontology simply didn't yield any T cell-related pathways, which is why I got the errors I reported later.
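
For anyone landing here later: the plot's data only holds the categories that were actually plotted (e.g. the top 10 per cluster with `showCategory=10`), so a term ranked lower would not be found there. A rough sketch of searching the full result table instead, using a toy data frame; with a real object the table would presumably come from `as.data.frame(GO_MF_up)`, which is an assumption and is not run here.

```r
## toy stand-in for the full result table; with a real compareCluster result
## this would be as.data.frame(GO_MF_up) (assumed, not run here)
full_res <- data.frame(
  Cluster     = c("A", "A", "B"),
  Description = c("kinase activity", "T cell receptor binding", "kinase activity"),
  p.adjust    = c(0.001, 0.04, 0.002),
  stringsAsFactors = FALSE
)

## search every enriched term, not only those shown in the first plot
keep <- grepl("T cell", full_res$Description, fixed = TRUE)
t_cell_sets <- unique(full_res$Description[keep])
print(t_cell_sets)

## only replot when something matched, e.g.
## if (length(t_cell_sets) > 0) dotplot(GO_MF_up, showCategory = t_cell_sets)
```

An empty `t_cell_sets` would confirm that no enriched term matches, rather than that a matching term was merely outside the plotted top categories.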