saezlab / decoupleR

R package to infer biological activities from omics data using a collection of methods.
https://saezlab.github.io/decoupleR/
GNU General Public License v3.0

get_collectri organism="mouse" returns "human" #87

Closed Pazuzzilla closed 1 year ago

Pazuzzilla commented 1 year ago

Hi,

I'm trying to run the Transcription factor activity inference from scRNA-seq on my dataset. When trying to get the network with:

net <- get_collectri(organism="mouse", split_complexes=FALSE)

I get the same network as if "human" were selected as the organism:

#> # A tibble: 43,178 × 3
#>    source target   mor
#>    <chr>  <chr>  <dbl>
#>  1 MYC    TERT       1
#>  2 SPI1   BGLAP      1
#>  3 SMAD3  JUN        1
#>  4 SMAD4  JUN        1
#>  5 STAT5A IL2        1
#>  6 STAT5B IL2        1
#>  7 RELA   FAS        1
#>  8 WT1    NR0B1      1
#>  9 NR0B2  CASP1     -1
#> 10 SP1    ALDOA      1

Is this correct? Maybe I've misunderstood something; I think I have everything up to date. My sessionInfo is below. Thanks

R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 20.3

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Rome
tzcode source: system (glibc)

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
 [1] OmnipathR_3.9.6             remotes_2.4.2              
 [3] decoupleR_2.5.2             dorothea_1.12.0            
 [5] cowplot_1.1.1               scCustomize_1.1.1          
 [7] superheat_0.1.0             ComplexHeatmap_2.16.0      
 [9] lemon_0.4.6                 pcaExplorer_2.26.1         
[11] org.Hs.eg.db_3.17.0         fgsea_1.26.0               
[13] ggsci_3.0.0                 lubridate_1.9.2            
[15] forcats_1.0.0               purrr_1.0.1                
[17] tibble_3.2.1                tidyverse_2.0.0            
[19] readr_2.1.4                 edgeR_3.42.2               
[21] rjson_0.2.21                GenomicFeatures_1.52.0     
[23] AnnotationDbi_1.62.1        tximport_1.28.0            
[25] EnhancedVolcano_1.18.0      ggrepel_0.9.3              
[27] ggalluvial_0.12.5           reshape2_1.4.4             
[29] msigdbr_7.5.1               stringr_1.5.0              
[31] pheatmap_1.0.12             RColorBrewer_1.1-3         
[33] DESeq2_1.40.1               limma_3.56.1               
[35] knitr_1.42                  infercnv_1.16.0            
[37] clustree_0.5.0              ggraph_2.1.0               
[39] scuttle_1.10.1              panc8.SeuratData_3.0.2     
[41] ifnb.SeuratData_3.1.0       SeuratData_0.2.2           
[43] copykat_1.0.5               FSA_0.9.4                  
[45] ggpubr_0.6.0                SeuratWrappers_0.3.0       
[47] monocle3_1.0.0              GSA_1.03.2                 
[49] AUCell_1.22.0               GSVA_1.48.0                
[51] WriteXLS_6.4.0              data.table_1.14.8          
[53] tidyr_1.3.0                 plyr_1.8.8                 
[55] SingleR_2.2.0               celldex_1.10.0             
[57] FNN_1.1.3.2                 class_7.3-22               
[59] harmony_0.1.1               Rcpp_1.0.10                
[61] readxl_1.4.2                R.utils_2.12.2             
[63] R.oo_1.25.0                 R.methodsS3_1.8.2          
[65] DropletUtils_1.20.0         SingleCellExperiment_1.22.0
[67] SummarizedExperiment_1.30.1 GenomicRanges_1.52.0       
[69] GenomeInfoDb_1.36.0         IRanges_2.34.0             
[71] S4Vectors_0.38.1            MatrixGenerics_1.12.0      
[73] matrixStats_0.63.0          SeuratDisk_0.0.0.9019      
[75] Seurat_4.3.0                reticulate_1.28            
[77] biomaRt_2.56.0              patchwork_1.1.2            
[79] CellChat_1.6.1              Biobase_2.60.0             
[81] BiocGenerics_0.46.0         ggplot2_3.4.2              
[83] igraph_1.4.2                dplyr_1.1.2                
[85] SeuratObject_4.1.3          sp_1.6-0                   

loaded via a namespace (and not attached):
  [1] DBI_1.1.3                     httr_1.4.6                   
  [3] ggh4x_0.2.4                   registry_0.5-1               
  [5] BiocParallel_1.34.1           prettyunits_1.1.1            
  [7] GenomicAlignments_1.36.0      sparseMatrixStats_1.12.0     
  [9] spatstat.geom_3.2-1           babelgene_22.9               
 [11] survMisc_0.5.6                pillar_1.9.0                 
 [13] Rgraphviz_2.44.0              R6_2.5.1                     
 [15] mime_0.12                     uwot_0.1.14                  
 [17] Category_2.66.0               viridis_0.6.3                
 [19] genefilter_1.82.1             Rhdf5lib_1.22.0              
 [21] libcoin_1.0-9                 ROCR_1.0-11                  
 [23] Hmisc_5.1-0                   rprojroot_2.0.3              
 [25] KMsurv_0.1-5                  parallelly_1.35.0            
 [27] GlobalOptions_0.1.2           caTools_1.18.2               
 [29] mgcv_1.8-42                   polyclip_1.10-4              
 [31] NMF_0.26                      beachmat_2.16.0              
 [33] htmltools_0.5.5               fansi_1.0.4                  
 [35] lambda.r_1.2.4                car_3.1-2                    
 [37] snakecase_0.11.0              spatstat.utils_3.0-3         
 [39] survminer_0.4.9               rpart_4.1.19                 
 [41] clue_0.3-64                   fitdistrplus_1.1-11          
 [43] goftest_1.2-3                 tidyselect_1.2.0             
 [45] RSQLite_2.3.1                 GenomeInfoDbData_1.2.10      
 [47] utf8_1.2.3                    ScaledMatrix_1.8.1           
 [49] scattermore_1.0               rvest_1.0.3                  
 [51] spatstat.data_3.0-1           gridExtra_2.3                
 [53] sctransform_0.3.5             future.apply_1.10.0          
 [55] graph_1.78.0                  topGO_2.52.0                 
 [57] vipor_0.4.5                   rtracklayer_1.60.0           
 [59] Rtsne_0.16                    DelayedMatrixStats_1.22.0    
 [61] lazyeval_0.2.2                scales_1.2.1                 
 [63] carData_3.0-5                 munsell_0.5.0                
 [65] bitops_1.0-7                  seriation_1.4.2              
 [67] labeling_0.4.2                KEGGREST_1.40.0              
 [69] promises_1.2.0.1              shape_1.4.6                  
 [71] rhdf5filters_1.12.1           zoo_1.8-12                   
 [73] princurve_2.1.6               locfit_1.5-9.7               
 [75] DelayedArray_0.26.2           RSpectra_0.16-1              
 [77] multcomp_1.4-23               assertthat_0.2.1             
 [79] paletteer_1.5.0               tools_4.3.0                  
 [81] ape_5.7-1                     processx_3.8.1               
 [83] shiny_1.7.4                   BiocFileCache_2.8.0          
 [85] GOstats_2.66.0                rlang_1.1.1                  
 [87] generics_0.1.3                BiocSingular_1.16.0          
 [89] ggridges_0.5.4                evaluate_0.21                
 [91] fastcluster_1.2.3             siggenes_1.74.0              
 [93] TrajectoryUtils_1.8.0         BiocIO_1.10.0                
 [95] UCell_2.4.0                   bcellViper_1.36.0            
 [97] colorspace_2.1-0              RBGL_1.76.0                  
 [99] ellipsis_0.3.2                withr_2.5.0                  
[101] bioc2020trajectories_0.0.0.93 shinyBS_0.61.1               
[103] RCurl_1.98-1.12               futile.logger_1.4.3          
[105] restfulr_0.0.15               xtable_1.8-4                 
[107] MatrixModels_0.5-1            systemfonts_1.0.4            
[109] httpuv_1.6.11                 rmarkdown_2.21               
[111] MASS_7.3-60                   dqrng_0.3.0                  
[113] broom_1.0.4                   deldir_1.0-6                 
[115] GO.db_3.17.0                  sandwich_3.0-2               
[117] rhdf5_2.44.0                  tensor_1.5                   
[119] vctrs_0.6.2                   lifecycle_1.0.3              
[121] logger_0.2.2                  codetools_0.2-19             
[123] DT_0.27                       here_1.0.1                   
[125] nlme_3.1-162                  future_1.32.0                
[127] progress_1.2.2                dbplyr_2.3.2                 
[129] cellranger_1.1.0              shinydashboard_0.7.2         
[131] rstudioapi_0.14               stringi_1.7.12               
[133] heatmaply_1.4.2               hms_1.1.3                    
[135] pbapply_1.7-0                 cachem_1.0.8                 
[137] multtest_2.56.0               BiocManager_1.30.20          
[139] hdf5r_1.3.8                   listenv_0.9.0                
[141] XVector_0.40.0                ggrastr_1.0.1                
[143] plotly_4.10.1                 ExperimentHub_2.8.0          
[145] pkgbuild_1.4.0                GetoptLong_1.0.5             
[147] HDF5Array_1.28.1              htmlwidgets_1.6.2            
[149] Formula_1.2-5                 interactiveDisplayBase_1.38.0
[151] dendextend_1.17.1             memoise_2.0.1                
[153] crayon_1.5.2                  rappdirs_0.3.3               
[155] S4Arrays_1.0.4                xml2_1.3.4                   
[157] filelock_1.0.2                png_0.1-8                    
[159] progressr_0.13.0              tzdb_0.4.0                   
[161] threejs_0.3.3                 fastmap_1.1.1                
[163] GSEABase_1.62.0               coda_0.19-4                  
[165] tidygraph_1.2.3               pkgconfig_2.0.3              
[167] cli_3.6.1                     beeswarm_0.4.0               
[169] ggforce_0.4.1                 ps_1.7.5                     
[171] ggsignif_0.6.4                nnet_7.3-19                  
[173] gridBase_0.4-7                lmtest_0.9-40                
[175] BiocVersion_3.17.1            RcppAnnoy_0.0.20             
[177] argparse_2.2.2                timechange_0.2.0             
[179] shinyAce_0.4.2                viridisLite_0.4.2            
[181] rjags_4-14                    foreign_0.8-84               
[183] splines_4.3.0                 blob_1.2.4                   
[185] annotate_1.78.0               XML_3.99-0.14                
[187] network_1.18.1                globals_0.16.2               
[189] ggbeeswarm_0.7.2              ggprism_1.0.4                
[191] AnnotationForge_1.42.0        ica_1.0-3                    
[193] compiler_4.3.0                janitor_2.2.0                
[195] RcppParallel_5.1.7            bit_4.0.5                    
[197] slingshot_2.8.0               AnnotationHub_3.8.0          
[199] ggpp_0.5.2                    BiocNeighbors_1.18.0         
[201] glue_1.6.2                    formatR_1.14                 
[203] ggnetwork_0.5.12              digest_0.6.31                
[205] irlba_2.3.5.1                 leiden_0.4.3                 
[207] graphlayouts_1.0.0            foreach_1.5.2                
[209] vroom_1.6.3                   spatstat.random_3.1-5        
[211] SparseM_1.81                  zlibbioc_1.46.0              
[213] tweenr_2.0.2                  lattice_0.21-8               
[215] rsvd_1.0.5                    mvtnorm_1.1-3                
[217] yaml_2.3.7                    later_1.3.1                  
[219] modeltools_0.2-23             statnet.common_4.8.0         
[221] backports_1.4.1               rstatix_0.7.2                
[223] Rsamtools_2.16.0              parallel_4.3.0               
[225] rematch2_2.1.2                sna_2.7-1                    
[227] parallelDist_0.2.6            quantreg_5.95                
[229] miniUI_0.1.1.1                gtable_0.3.3                 
[231] abind_1.4-5                   xfun_0.39                    
[233] Cairo_1.6-0                   Biostrings_2.68.0            
[235] crosstalk_1.2.0               webshot_0.5.4                
[237] curl_5.0.0                    callr_3.7.3                  
[239] doParallel_1.0.17             KernSmooth_2.23-21           
[241] futile.options_1.0.1          survival_3.5-5               
[243] desc_1.4.2                    jsonlite_1.8.4               
[245] magrittr_2.0.3                coin_1.4-2                   
[247] svglite_2.1.1                 base64enc_0.1-3              
[249] scrime_1.3.5                  iterators_1.0.14             
[251] TH.data_1.1-2                 Matrix_1.5-4                 
[253] km.ci_0.5-6                   fastmatch_1.1-3              
[255] ggpmisc_0.5.2                 checkmate_2.2.0              
[257] gtools_3.9.4                  htmlTable_2.4.1              
[259] spatstat.sparse_3.0-1         rngtools_1.5.2               
[261] RANN_2.6.1                    writexl_1.4.2                
[263] phyclust_0.1-33               circlize_0.4.15              
[265] spatstat.explore_3.2-1        polynom_1.4-1                
[267] bit64_4.0.5                   TSP_1.2-4                    
[269] cluster_2.1.4                 ca_0.71.1                    
[271] farver_2.1.1                  gplots_3.1.3

PauBadiaM commented 1 year ago

Huh, that's weird, you seem to have the right versions of OmnipathR and decoupleR. I tried and I do get the correct network:

df <- decoupleR::get_collectri(organism='mouse', split_complexes=FALSE)
df
[2023-06-23 15:19:25] [SUCCESS] [OmnipathR] Downloaded 38823 interactions.
# A tibble: 38,665 × 3
   source target   mor
   <chr>  <chr>  <dbl>
 1 Myc    Tert       1
 2 Spi1   Bglap2     1
 3 Spi1   Bglap      1
 4 Spi1   Bglap3     1
 5 Smad3  Jun        1
 6 Smad4  Jun        1
 7 Stat5a Il2        1
 8 Stat5b Il2        1
 9 Rela   Fas        1
10 Wt1    Nr0b1      1
# … with 38,655 more rows
# ℹ Use `print(n = ...)` to see more rows

Maybe it's the cache? @deeenes, what is your opinion? What was the command to reset the cache in the R version of omnipath? Just in case, I would try reinstalling from GitHub, restarting your R session and trying again:

remotes::install_github('saezlab/omnipathr')
remotes::install_github('saezlab/decoupleR')

deeenes commented 1 year ago

Like Pau, I see the correct behaviour. I don't see anything wrong in the code, and I can't imagine caching causing this issue, though it can't hurt to try with an empty cache. In this procedure we get mouse records from the server; the translation from human to mouse doesn't happen on the client side but during the database build. In the log we can see the organisms=10090 parameter:

library(decoupleR)
library(OmnipathR)
omnipath_set_console_loglevel('trace')
ci <- get_collectri('mouse', split_complexes = FALSE)
[2023-06-23 15:33:04] [INFO]    [OmnipathR] Cache record does not exist: `https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=10090&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&loops=yes&license=academic`
[2023-06-23 15:33:04] [INFO]    [OmnipathR] Retrieving URL: `https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=10090&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&loops=yes&license=academic`
[2023-06-23 15:33:04] [TRACE]   [OmnipathR] Attempt 1/3: `https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=10090&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&loops=yes&license=academic`
[2023-06-23 15:33:05] [TRACE]   [OmnipathR] Reading JSON from `/home/denes/.cache/OmnipathR/cache.json` (encoding: UTF-8).
[2023-06-23 15:33:05] [TRACE]   [OmnipathR] JSON validation successful: TRUE
[2023-06-23 15:33:05] [TRACE]   [OmnipathR] Reading JSON from `/home/denes/.cache/OmnipathR/cache.json` (encoding: UTF-8).
[2023-06-23 15:33:05] [TRACE]   [OmnipathR] JSON validation successful: TRUE
[2023-06-23 15:33:05] [INFO]    [OmnipathR] Cache item `107896962c7d5fc50d4cbc51809a591cd50bb105` version 1: status changed from `unknown` to `started`.
[2023-06-23 15:33:05] [TRACE]   [OmnipathR] Exporting object to RDS: `/home/denes/.cache/OmnipathR/107896962c7d5fc50d4cbc51809a591cd50bb105-1.rds`.
[2023-06-23 15:33:06] [TRACE]   [OmnipathR] Exported RDS to `/home/denes/.cache/OmnipathR/107896962c7d5fc50d4cbc51809a591cd50bb105-1.rds`.
[2023-06-23 15:33:06] [INFO]    [OmnipathR] Download ready [key=107896962c7d5fc50d4cbc51809a591cd50bb105, version=1]
[2023-06-23 15:33:06] [TRACE]   [OmnipathR] Reading JSON from `/home/denes/.cache/OmnipathR/cache.json` (encoding: UTF-8).
[2023-06-23 15:33:06] [TRACE]   [OmnipathR] JSON validation successful: TRUE
[2023-06-23 15:33:06] [INFO]    [OmnipathR] Cache item `107896962c7d5fc50d4cbc51809a591cd50bb105` version 1: status changed from `started` to `ready`.
[2023-06-23 15:33:06] [TRACE]   [OmnipathR] Converting JSON column `evidences` to list.
[2023-06-23 15:33:08] [TRACE]   [OmnipathR] Restricting interaction records to datasets: collectri; and resources: any
[2023-06-23 15:33:10] [TRACE]   [OmnipathR] Filtering evidence columns: positive, negative, directed, undirected; to datasets: collectri; and resources: any
[2023-06-23 15:34:10] [SUCCESS] [OmnipathR] Downloaded 38823 interactions.

ci
# A tibble: 38,665 × 3
   source target   mor
   <chr>  <chr>  <dbl>
 1 Myc    Tert       1
 2 Spi1   Bglap2     1
 3 Spi1   Bglap      1
 4 Spi1   Bglap3     1
 5 Smad3  Jun        1
 6 Smad4  Jun        1
 7 Stat5a Il2        1
 8 Stat5b Il2        1
 9 Rela   Fas        1
10 Wt1    Nr0b1      1
# ℹ 38,655 more rows
# ℹ Use `print(n = ...)` to see more rows

And how to empty the cache:

library(OmnipathR)
omnipath_cache_wipe()

Please keep us updated, as this issue is kind of mysterious, I can't guess the reason right away. If you could share the trace level log (similar to the one I pasted above) that might give a clue.
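A quick sanity check before digging into logs: in the outputs pasted in this thread, the human network prints all-caps symbols (MYC, TERT) while the mouse network prints capitalized ones (Myc, Tert). The sketch below measures that difference. It is only a heuristic, `fraction_all_caps` is a hypothetical helper (not part of decoupleR or OmnipathR), and the toy rows are copied from the outputs above.

```r
# Hypothetical helper, not part of decoupleR or OmnipathR.
# Returns the fraction of gene symbols that contain no lowercase letter,
# which is the pattern seen in the human output in this thread.
fraction_all_caps <- function(symbols) {
  symbols <- as.character(symbols)
  mean(symbols == toupper(symbols))
}

# Toy rows copied from the outputs shown in this thread.
human_like <- data.frame(
  source = c("MYC", "SPI1", "SMAD3"),
  target = c("TERT", "BGLAP", "JUN"),
  mor = c(1, 1, 1)
)
mouse_like <- data.frame(
  source = c("Myc", "Spi1", "Smad3"),
  target = c("Tert", "Bglap2", "Jun"),
  mor = c(1, 1, 1)
)

human_fraction <- fraction_all_caps(c(human_like$source, human_like$target))
mouse_fraction <- fraction_all_caps(c(mouse_like$source, mouse_like$target))
human_fraction  # 1
mouse_fraction  # 0
```

On a real network, pass `c(net$source, net$target)`; a value close to 1 after requesting mouse suggests the human table came back. Some mouse symbols contain no lowercase letters at all, so treat the number as a hint rather than a proof.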

Pazuzzilla commented 1 year ago

Hi,

Just an update: restarting the R session solved the problem. Unfortunately I tried it before reading your advice, so I can't produce the trace from when the problem occurred. Sorry; if I face the problem again I will give you more information about it.

Thanks for your work.

ChrisTzaferis commented 1 year ago

Hello, I have faced the same issue in a server environment. I tried the solution proposed above, clearing the cache, and I still get the human output. I noticed that in the log the organism ID is again set to human (organisms=9606):

ci <- get_collectri('mouse', split_complexes = FALSE)
[2023-09-11 11:00:02] [INFO]    [OmnipathR] Cache record does not exist: `https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=9606&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&license=academic`
[2023-09-11 11:00:02] [INFO]    [OmnipathR] Retrieving URL: `https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=9606&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&license=academic`
[2023-09-11 11:00:02] [TRACE]   [OmnipathR] Attempt 1/3: `https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=9606&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&license=academic`
...

The session info output is the following:

> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] OmnipathR_3.9.4 decoupleR_2.6.0

loaded via a namespace (and not attached):
 [1] rappdirs_0.3.3    utf8_1.2.3        generics_0.1.3    tidyr_1.3.0      
 [5] xml2_1.3.4        stringi_1.7.12    lattice_0.21-8    hms_1.1.3        
 [9] digest_0.6.31     magrittr_2.0.3    evaluate_0.21     grid_4.3.0       
[13] fastmap_1.1.1     cellranger_1.1.0  jsonlite_1.8.4    Matrix_1.5-4.1   
[17] progress_1.2.2    backports_1.4.1   httr_1.4.6        rvest_1.0.3      
[21] purrr_1.0.1       fansi_1.0.4       cli_3.6.1         rlang_1.1.1      
[25] crayon_1.5.2      bit64_4.0.5       withr_2.5.0       yaml_2.3.7       
[29] parallel_4.3.0    tools_4.3.0       tzdb_0.4.0        checkmate_2.2.0  
[33] dplyr_1.1.2       curl_5.0.0        vctrs_0.6.2       logger_0.2.2     
[37] R6_2.5.1          lifecycle_1.0.3   stringr_1.5.0     bit_4.0.5        
[41] vroom_1.6.3       pkgconfig_2.0.3   pillar_1.9.0      later_1.3.1      
[45] glue_1.6.2        Rcpp_1.0.10       xfun_0.39         tibble_3.2.1     
[49] tidyselect_1.2.0  knitr_1.43        htmltools_0.5.5   igraph_1.4.3     
[53] rmarkdown_2.22    readr_2.1.4       compiler_4.3.0    prettyunits_1.1.1
[57] readxl_1.4.2     

If you have any suggestion it would be really helpful, thank you.
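The telling detail in the two logs in this thread is the `organisms` parameter of the request URL: 10090 is the NCBI Taxonomy ID for mouse and 9606 the one for human. A minimal base R sketch for pulling that value out of a logged URL follows; `organism_from_url` is a hypothetical helper written for illustration, not an OmnipathR function.

```r
# Hypothetical helper: extract the NCBI Taxonomy ID from the `organisms`
# query parameter of an OmniPath request URL as it appears in the log.
organism_from_url <- function(url) {
  hit <- regmatches(url, regexpr("organisms=[0-9]+", url))
  if (length(hit) == 0) {
    return(NA_integer_)  # parameter not present in this URL
  }
  as.integer(sub("organisms=", "", hit, fixed = TRUE))
}

# Request URLs copied from the two logs in this thread.
url_ok <- paste0(
  "https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri",
  "&organisms=10090&dorothea_levels=A,B",
  "&fields=evidences,sources,references,curation_effort&loops=yes&license=academic"
)
url_bad <- paste0(
  "https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri",
  "&organisms=9606&dorothea_levels=A,B",
  "&fields=evidences,sources,references,curation_effort&license=academic"
)

taxon_names <- c("9606" = "human", "10090" = "mouse")
taxon_names[[as.character(organism_from_url(url_ok))]]   # "mouse"
taxon_names[[as.character(organism_from_url(url_bad))]]  # "human"
```

If the ID in the log does not match the organism that was requested, the wrong table was fetched from the server, and wiping the local cache alone will not change the result.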

PauBadiaM commented 1 year ago

Hi @ChrisTzaferis,

It could be that at that moment the omnipath server was down. Could you try again now? What do you think @deeenes ?

ChrisTzaferis commented 1 year ago

Hi @PauBadiaM , thank you for your response. I have also tried it in a local conda environment with R 4.3.0 and I have the same problem today.

omnipath_cache_wipe()
[2023-09-18 11:41:59] [SUCCESS] [OmnipathR] Removing all cache contents from /home/tzafchris/.cache/OmnipathR.

library(decoupleR)
library(OmnipathR)
omnipath_set_console_loglevel('trace')
mouse_com_false <- get_collectri('mouse', split_complexes = FALSE)
[2023-09-18 11:42:48] [INFO] [OmnipathR] Cache record does not exist: https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=9606&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&license=academic
[2023-09-18 11:42:48] [INFO] [OmnipathR] Retrieving URL: https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=9606&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&license=academic
[2023-09-18 11:42:48] [TRACE] [OmnipathR] Attempt 1/3: https://omnipathdb.org/interactions?genesymbols=yes&datasets=collectri&organisms=9606&dorothea_levels=A,B&fields=evidences,sources,references,curation_effort&license=academic
[2023-09-18 11:42:49] [TRACE] [OmnipathR] Reading JSON from /home/tzafchris/.cache/OmnipathR/cache.json (encoding: UTF-8).
[2023-09-18 11:42:49] [TRACE] [OmnipathR] JSON validation successful: TRUE
[2023-09-18 11:42:49] [TRACE] [OmnipathR] Reading JSON from /home/tzafchris/.cache/OmnipathR/cache.json (encoding: UTF-8).
[2023-09-18 11:42:49] [TRACE] [OmnipathR] JSON validation successful: TRUE
[2023-09-18 11:42:49] [INFO] [OmnipathR] Cache item 099e7af92f71a88d7560d65cea14c1970f66c0b6 version 1: status changed from unknown to started.
[2023-09-18 11:42:49] [TRACE] [OmnipathR] Exporting object to RDS: /home/tzafchris/.cache/OmnipathR/099e7af92f71a88d7560d65cea14c1970f66c0b6-1.rds.
[2023-09-18 11:42:50] [TRACE] [OmnipathR] Exported RDS to /home/tzafchris/.cache/OmnipathR/099e7af92f71a88d7560d65cea14c1970f66c0b6-1.rds.
[2023-09-18 11:42:50] [INFO] [OmnipathR] Download ready [key=099e7af92f71a88d7560d65cea14c1970f66c0b6, version=1]
[2023-09-18 11:42:50] [TRACE] [OmnipathR] Reading JSON from /home/tzafchris/.cache/OmnipathR/cache.json (encoding: UTF-8).
[2023-09-18 11:42:50] [TRACE] [OmnipathR] JSON validation successful: TRUE
[2023-09-18 11:42:50] [INFO] [OmnipathR] Cache item 099e7af92f71a88d7560d65cea14c1970f66c0b6 version 1: status changed from started to ready.
[2023-09-18 11:42:50] [TRACE] [OmnipathR] Converting JSON column evidences to list.
[2023-09-18 11:42:51] [TRACE] [OmnipathR] Restricting interaction records to datasets: collectri; and resources: any
[2023-09-18 11:42:52] [TRACE] [OmnipathR] Filtering evidence columns: positive, negative, directed, undirected; to datasets: collectri; and resources: any
[2023-09-18 11:43:11] [SUCCESS] [OmnipathR] Downloaded 64495 interactions.

mouse_com_false
# A tibble: 42,595 × 3
   source target   mor
   <chr>  <chr>  <dbl>
 1 MYC    TERT       1
 2 SPI1   BGLAP      1
 3 SMAD3  JUN        1
 4 SMAD4  JUN        1
 5 STAT5A IL2        1
 6 STAT5B IL2        1
 7 RELA   FAS        1
 8 WT1    NR0B1      1
 9 NR0B2  CASP1      1
10 SP1    ALDOA      1
# ℹ 42,585 more rows
# ℹ Use `print(n = ...)` to see more rows

deeenes commented 1 year ago

I managed to reproduce the issue with decoupleR 2.6.0, which is the current release. With the development version (2.7.0) we see the correct behaviour: we get mouse data for mouse queries. @ChrisTzaferis, I recommend updating decoupleR from GitHub:

library(remotes)
remotes::install_github('saezlab/decoupleR')
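After reinstalling and restarting R, it may be worth confirming that the session actually picked up the newer package. The sketch below compares a version string against 2.7.0, the version reported as working in this thread; `has_fix` is a hypothetical helper, and the last commented line assumes decoupleR is installed.

```r
# Version reported in this thread as returning mouse data for mouse queries.
fixed_in <- package_version("2.7.0")

# Hypothetical helper: TRUE if `installed` (a version string) is at least
# the version that was reported as working.
has_fix <- function(installed, required = fixed_in) {
  package_version(installed) >= required
}

has_fix("2.6.0")  # FALSE
has_fix("2.7.0")  # TRUE

# In a real session (assumes decoupleR is installed):
# has_fix(as.character(utils::packageVersion("decoupleR")))
```

Only 2.6.0 and 2.7.0 were compared in this thread, so the threshold is an assumption drawn from those two reports rather than a documented fix version.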

ChrisTzaferis commented 1 year ago

Thank you @deeenes for your suggestion! Indeed, with version 2.7.0 I get the mouse interactions. One last question regarding the same issue: do you get a different number of interactions when setting the parameter split_complexes = TRUE/FALSE? In my case the same number is retrieved in both cases.

deeenes commented 1 year ago

Briefly: it seems alright to me, there are no complexes in the mouse CollecTRI dataset.

More details:

CollecTRI contains almost no complexes: the only two TF complexes present are NFKB and AP1. The split_complexes parameter only concerns these two; I'm not sure about the idea behind that code. If you only want to control whether the interactions of those two complexes are included, you can use the entity_types argument from OmnipathR:

ci_complexes <- get_collectri('mouse', entity_types = c('complex', 'protein'))
ci_only_proteins <- get_collectri('mouse', entity_types = 'protein')

However, the two data frames, as processed and used by decoupleR, will be identical, because it uses only gene symbols, and the names of the complexes will match certain gene symbols. In addition, those two complexes from CollecTRI are lost in the human → mouse translation. They are both major master TFs, with about 23k interactions between ~20 TF varieties and 1.1k target genes in human.

library(OmnipathR)

ci_p_h <- collectri(entity_types = 'protein')
ci_p_m <- collectri(organism = 10090L, entity_types = 'protein')
ci_c_h <- collectri(entity_types = c('complex', 'protein'))
ci_c_m <- collectri(organism = 10090L, entity_types = c('complex', 'protein'))
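Once the four tables above are downloaded, one simple way to see whether the `entity_types` filter changes anything is to put their row counts side by side. The sketch below does that with a hypothetical helper, `count_rows`, and runs on toy stand-in data frames so it works offline; the toy rows are illustrative and are not real query results.

```r
# Hypothetical helper: tabulate row counts for a named list of data frames.
count_rows <- function(tables) {
  data.frame(
    table = names(tables),
    n_rows = unname(vapply(tables, nrow, integer(1))),
    row.names = NULL
  )
}

# Toy stand-ins so the sketch runs without a network connection.
# In a real session, pass the downloaded tables instead, e.g.
# list(ci_p_h = ci_p_h, ci_p_m = ci_p_m, ci_c_h = ci_c_h, ci_c_m = ci_c_m).
toy <- list(
  proteins_only = data.frame(
    source = c("Myc", "Spi1"),
    target = c("Tert", "Bglap")
  ),
  with_complexes = data.frame(
    source = c("Myc", "Spi1"),
    target = c("Tert", "Bglap")
  )
)

counts <- count_rows(toy)
counts

# Equal counts mean the filter removed no rows from these tables.
same_size <- length(unique(counts$n_rows)) == 1
same_size
```

Equal row counts do not prove that two tables are identical; `identical()` or `dplyr::anti_join()` on the source and target columns would be the stricter comparison.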

ChrisTzaferis commented 1 year ago

Thank you very much @deeenes for your time and your help!