saezlab / CollecTRI

Gene regulatory network containing signed transcription factor-target gene interactions
GNU General Public License v3.0

Loading mouse regulons returns human regulons #14

Closed annasen closed 1 month ago

annasen commented 10 months ago

Hello, I am using CollecTRI to study the mouse genome. When I run net <- decoupleR::get_collectri(organism = "mouse", split_complexes = FALSE) in R, I receive the same output as when running get_collectri for human. I noticed this because the gene symbols in the net data frame are all uppercase.
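
A quick way to check for this symptom, as a minimal sketch: human gene symbols are conventionally all uppercase, while mouse symbols usually have only the first letter capitalized. The helper name share_all_uppercase and the toy vectors are hypothetical; in practice the function would be applied to whichever column of the returned network holds the gene symbols.

```r
# Hypothetical helper: share of symbols (among those containing letters)
# that are written entirely in uppercase.
share_all_uppercase <- function(symbols) {
  symbols <- unique(as.character(symbols))
  symbols <- symbols[grepl("[A-Za-z]", symbols)]
  if (length(symbols) == 0) {
    return(NA_real_)
  }
  mean(symbols == toupper(symbols))
}

# Toy vectors standing in for a gene symbol column of a network
human_like <- c("TP53", "MYC", "STAT3")
mouse_like <- c("Trp53", "Myc", "Stat3")

share_all_uppercase(human_like)  # 1
share_all_uppercase(mouse_like)  # 0
```

A share close to 1 in a network requested for mouse suggests that human symbols were returned.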

PauBadiaM commented 9 months ago

Hi @annasen ,

Could you please install the latest versions for decoupleR and OmnipathR and try again?

remotes::install_github('saezlab/omnipathr')
remotes::install_github('saezlab/decoupleR')

Hope this is helpful!

annasen commented 9 months ago

Yes it worked, thank you!

Fredspo commented 9 months ago

Hi @PauBadiaM ,

I am posting on this issue because we encountered something strange while using CollecTRI for mouse:

net <- decoupleR::get_dorothea(organism = "mouse", levels = c('A', 'B'))
net <- decoupleR::get_collectri(organism = "mouse", split_complexes = FALSE)

Basically, some genes do not seem to represent the mouse ortholog. For example, in the net data frame we found that the gene for the p53 protein is not the correct one. TP53 is the human gene, whereas the mouse ortholog is Trp53: https://www.informatics.jax.org/marker/MGI:98834

PauBadiaM commented 9 months ago

@deeenes, can you have a look at this?

deeenes commented 9 months ago

Hello,

We use the primary gene symbols provided by UniProt. As we see here, Tp53 is the primary symbol for P53_MOUSE (the SwissProt record is also linked from here). It is UniProt's decision which authority they rely on for each species when choosing the primary symbols. This gene is the mouse ortholog of TP53, and CollecTRI data is translated from human by orthologous gene pairs; that is why we have this symbol in the table.

I would recommend translating the gene name synonyms in your data to primary gene symbols via UniProt IDs:

library(OmnipathR)
library(magrittr)

'Trp53' %>%
  translate_ids(genesymbol_syn, uniprot, organism = 10090) %>%
  translate_ids(uniprot, genesymbol, organism = 10090)
[1] "Tp53"

Use the function above on the complete vector or data frame column, as shown here. Please also update OmnipathR, because I have just found and fixed a bug in translate_ids.
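
For a data frame column, the same two-step translation might look like the sketch below. The wrapper to_primary_symbols, the column name symbol and the example data frame are hypothetical. The sketch also assumes that translate_ids names the new columns after the target ID types (uniprot, then genesymbol) when they are passed unnamed. It needs OmnipathR and an internet connection, so the actual call is left commented out.

```r
# Hypothetical wrapper: translate a `symbol` column holding mouse gene symbol
# synonyms to the primary gene symbols used by UniProt.
to_primary_symbols <- function(d, organism = 10090) {
  stopifnot(is.data.frame(d), "symbol" %in% names(d))
  if (!requireNamespace("OmnipathR", quietly = TRUE)) {
    stop("Please install OmnipathR first.")
  }
  # Step 1: synonym -> UniProt ID (assumed to add a `uniprot` column)
  d <- OmnipathR::translate_ids(
    d, symbol = genesymbol_syn, uniprot, organism = organism
  )
  # Step 2: UniProt ID -> primary gene symbol (assumed to add `genesymbol`)
  OmnipathR::translate_ids(d, uniprot, genesymbol, organism = organism)
}

# Toy input; the column name `symbol` is an assumption of this sketch
genes <- data.frame(symbol = c("Trp53", "Myc"), stringsAsFactors = FALSE)

# to_primary_symbols(genes)  # requires OmnipathR and network access
```

One-to-many mappings may add rows to the result, so it is worth comparing the row count before and after the translation.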

Fredspo commented 8 months ago

Hello @deeenes ,

Thank you for your reply.

I forced a re-install of the OmnipathR package and ran your snippet to do an initial test. However, I unfortunately did not get the same result. Could there be an additional bug?

[Screenshot, 2024-03-28 08:30:29]

deeenes commented 8 months ago

It's not a bug, I rather suspect an old item in the cache. Easiest to try with an empty cache:

library(OmnipathR)
library(magrittr)

omnipath_set_cachedir(tempdir())

'Trp53' %>%
  translate_ids(genesymbol_syn, uniprot, organism = 10090) %>%
  translate_ids(uniprot, genesymbol, organism = 10090)
[1] "Tp53"

If it succeeds with the empty cache, you can empty your default cache (~/.cache/OmnipathR) or delete the relevant items. If the above doesn't return "Tp53", please share the following outputs:

The version of the loaded package:

packageVersion('OmnipathR')
[1] ‘3.11.10’

The code of this recently updated function:

OmnipathR::uniprot_full_id_mapping_table

The log trace of the discussed operation:

library(OmnipathR)
library(magrittr)

omnipath_set_cachedir(tempdir())
omnipath_set_console_loglevel('trace')

'Trp53' %>%
  translate_ids(genesymbol_syn, uniprot, organism = 10090) %>%
  translate_ids(uniprot, genesymbol, organism = 10090)
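
Before emptying the default cache mentioned above (~/.cache/OmnipathR), it may help to see what it currently holds. The following is a minimal, read-only sketch in base R. The helper name cache_summary is hypothetical, and the default path is the one quoted in this thread, which may differ on other systems.

```r
# Hypothetical helper: count the files in a cache directory and sum their
# sizes, without modifying anything.
cache_summary <- function(cache_dir = "~/.cache/OmnipathR") {
  cache_dir <- path.expand(cache_dir)
  if (!dir.exists(cache_dir)) {
    return(data.frame(n_files = 0L, total_mb = 0))
  }
  files <- list.files(cache_dir, full.names = TRUE, recursive = TRUE)
  data.frame(
    n_files = length(files),
    total_mb = round(sum(file.size(files), na.rm = TRUE) / 1024^2, 2)
  )
}

cache_summary()
```

If the directory is not empty and the translation only succeeds with the temporary cache, stale cache items are the likely cause.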

deeenes commented 3 months ago

An update on this issue: we introduced the genesymbol_resource argument, which is "uniprot" by default and can alternatively be set to "ensembl". In the latter case, the gene symbols are updated to the ones used in Ensembl, for example, Trp53 instead of Tp53.

The original issue is already fixed; I think this can be closed.

ASNbioinf commented 1 month ago

My problem is that the function "get_collectri" may not exist

remotes::install_github('saezlab/omnipathr')
remotes::install_github('saezlab/decoupleR')
library(decoupleR)
library(OmnipathR)
net <- get_collectri(organism='human', split_complexes=FALSE)
net

Error in get_collectri(organism = "human", split_complexes = FALSE): could not find function "get_collectri"

gabora commented 1 month ago

@ASNbioinf The installation probably failed. Check the output of install_github and of loading the libraries.
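
One way to check this from the R console, as a small base R sketch: test whether the package can be loaded and whether it exports the function. The helper name has_export is hypothetical.

```r
# Hypothetical helper: TRUE if `pkg` can be loaded and exports `fn`.
has_export <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

has_export("decoupleR", "get_collectri")
```

A FALSE result means that either decoupleR is not installed or the installed version does not export get_collectri; packageVersion("decoupleR") then shows which version is actually in use.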

deeenes commented 1 month ago

@ASNbioinf This is an old issue that has nothing to do with your current problem. Please open a new issue that includes the relevant information, especially your sessionInfo and any errors you encounter.