Closed slowkow closed 4 years ago
I think I'm starting to get it... there are a lot of functions to explore.
Here's what I have now:
d <- import_LigrecExtra_Interactions(select_organism = 9606)
d[1:5,1:5]
source target source_genesymbol target_genesymbol is_directed
1 P46531 Q9Y219 NOTCH1 JAG2 1
2 Q9Y219 P46531 JAG2 NOTCH1 1
3 O00548 P46531 DLL1 NOTCH1 1
4 P46531 O00548 NOTCH1 DLL1 1
5 P05019 P08069 IGF1 IGF1R 1
Looks great!
Hi @slowkow,
The ligrecextra is a dataset within the interaction database of OmniPath, containing interactions from resources that are dedicated to ligand-receptor relationships but provide no literature references. Ligand-receptor and other cell-cell interactions may also be part of other datasets, so I wouldn't recommend using only this one.
The import_omnipath_intercell function retrieves the intercellular communication role annotations (the intercell database of OmniPath).
OmnipathR has a function to combine these annotations with the interactions to build a network of intercellular communication:
icn <- import_intercell_network()
This function is very flexible, and I recommend reading its docs. It passes its parameters to import_omnipath_interactions and import_omnipath_intercell.
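To make the idea of combining annotations with interactions concrete, here is a minimal base R sketch on toy data. It is not the implementation of import_intercell_network(); it only assumes that the combination amounts to joining intercell annotations onto the source and target of each interaction. The identifiers are copied from rows shown in this thread, the ligand/receptor labels for IGF1 and IGF1R are illustrative, and the output column names are chosen to resemble the real ones.

```r
# Toy interaction table (identifiers copied from rows shown in this thread).
interactions <- data.frame(
  source = c("O43927", "P05019"),
  target = c("O00574", "P08069"),
  source_genesymbol = c("CXCL13", "IGF1"),
  target_genesymbol = c("CXCR6", "IGF1R"),
  stringsAsFactors = FALSE
)

# Toy intercell annotations. The IGF1/IGF1R categories are illustrative.
# ACE is annotated but takes part in no interaction above, so it drops out.
intercell <- data.frame(
  uniprot = c("O43927", "O00574", "P05019", "P08069", "P12821"),
  category = c("ligand", "receptor", "ligand", "receptor", "transmembrane"),
  stringsAsFactors = FALSE
)

# Assumption for this sketch: ligands annotate the source side,
# receptors annotate the target side.
transmitters <- intercell[intercell$category == "ligand", ]
receivers <- intercell[intercell$category == "receptor", ]

# Join the source-side annotation, then the target-side annotation.
net <- merge(interactions, transmitters, by.x = "source", by.y = "uniprot")
names(net)[names(net) == "category"] <- "category_intercell_source"
net <- merge(net, receivers, by.x = "target", by.y = "uniprot")
names(net)[names(net) == "category"] <- "category_intercell_target"

print(net[, c("source_genesymbol", "target_genesymbol",
              "category_intercell_source", "category_intercell_target")])
```

Each remaining row is an interaction whose source and target both carry an annotation, which is the shape of table the question asks for.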
Best,
Denes
Thanks for the tip, @deeenes
Here's what I get:
> icn <- OmnipathR::import_Omnipath_intercell()
Downloaded 267508 intercell records
> icn
category parent database scope aspect
1 transmembrane transmembrane UniProt_location generic locational
2 transmembrane transmembrane UniProt_location generic locational
3 transmembrane transmembrane UniProt_location generic locational
4 transmembrane transmembrane UniProt_location generic locational
5 transmembrane transmembrane UniProt_location generic locational
6 transmembrane transmembrane UniProt_location generic locational
7 transmembrane transmembrane UniProt_location generic locational
8 transmembrane transmembrane UniProt_location generic locational
9 transmembrane transmembrane UniProt_location generic locational
10 transmembrane transmembrane UniProt_location generic locational
11 transmembrane transmembrane UniProt_location generic locational
12 transmembrane transmembrane UniProt_location generic locational
13 transmembrane transmembrane UniProt_location generic locational
source uniprot genesymbol entity_type consensus_score transmitter
1 resource_specific Q8TDQ1 CD300LF protein 7 False
2 resource_specific Q02223 TNFRSF17 protein 8 False
3 resource_specific Q7Z3J2 C16orf62 protein 4 False
4 resource_specific Q14C87 TMEM132D protein 6 False
5 resource_specific Q8N5U1 MS4A15 protein 5 False
6 resource_specific Q9Y6I8 PXMP4 protein 6 False
7 resource_specific P12821 ACE protein 8 False
8 resource_specific Q96RI9 TAAR9 protein 5 False
9 resource_specific P16109 SELP protein 8 False
10 resource_specific Q04656 ATP7A protein 7 False
11 resource_specific B6A8C7 TARM1 protein 5 False
12 resource_specific P00846 MT-ATP6 protein 5 False
13 resource_specific O00258 WRB protein 6 False
receiver secreted plasma_membrane_transmembrane plasma_membrane_peripheral
1 False False True False
2 False False True False
3 False False False False
4 False False False False
5 False False False False
6 False False False False
7 False True True False
8 False False True False
9 False False True False
10 False False False False
11 False False True False
12 False False False False
13 False False False False
[ reached 'max' / getOption("max.print") -- omitted 267495 rows ]
This looks interesting, but I don't see how we can convert this to gene pairs. Am I missing something?
Sorry, it seems that this table has gene pairs, but I didn't find them until I started poking around.
> icn %>% dplyr::filter(str_detect(uniprot, "_"), consensus_score > 10)
category parent database scope aspect source
1 ligand ligand Matrisome generic functional resource_specific
2 ligand ligand Matrisome generic functional resource_specific
3 ligand ligand Matrisome generic functional resource_specific
4 ligand ligand iTALK generic functional resource_specific
5 ligand ligand iTALK generic functional resource_specific
6 ligand ligand iTALK generic functional resource_specific
7 ligand ligand EMBRACE generic functional resource_specific
8 ligand ligand EMBRACE generic functional resource_specific
9 ligand ligand EMBRACE generic functional resource_specific
10 ligand ligand HGNC generic functional resource_specific
11 ligand ligand HGNC generic functional resource_specific
12 ligand ligand HGNC generic functional resource_specific
13 ligand ligand CellPhoneDB generic functional resource_specific
uniprot genesymbol entity_type consensus_score
1 COMPLEX:P20783_P23560 COMPLEX:BDNF_NTF3 complex 12
2 COMPLEX:P08476_P09529 COMPLEX:INHBA_INHBB complex 12
3 COMPLEX:P26441_Q9UBD9 COMPLEX:CLCF1_CNTF complex 11
4 COMPLEX:P08476_P09529 COMPLEX:INHBA_INHBB complex 12
5 COMPLEX:P20783_P23560 COMPLEX:BDNF_NTF3 complex 12
6 COMPLEX:P26441_Q9UBD9 COMPLEX:CLCF1_CNTF complex 11
7 COMPLEX:P20783_P23560 COMPLEX:BDNF_NTF3 complex 12
8 COMPLEX:P08476_P09529 COMPLEX:INHBA_INHBB complex 12
9 COMPLEX:P26441_Q9UBD9 COMPLEX:CLCF1_CNTF complex 11
10 COMPLEX:P08476_P09529 COMPLEX:INHBA_INHBB complex 12
11 COMPLEX:P20783_P23560 COMPLEX:BDNF_NTF3 complex 12
12 COMPLEX:P26441_Q9UBD9 COMPLEX:CLCF1_CNTF complex 11
13 COMPLEX:P08476_P09529 COMPLEX:INHBA_INHBB complex 12
transmitter receiver secreted plasma_membrane_transmembrane
1 True False True False
2 True False True False
3 True False True False
4 True False True False
5 True False True False
6 True False True False
7 True False True False
8 True False True False
9 True False True False
10 True False True False
11 True False True False
12 True False True False
13 True False True False
plasma_membrane_peripheral
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
10 False
11 False
12 False
13 False
[ reached 'max' / getOption("max.print") -- omitted 625 rows ]
This is very useful! I like that each pair is annotated with some information about the database!
Could I please ask if you might comment on the consensus_score column? I can't find how this value is defined.
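The rows in the filtered output above have entity_type "complex": each identifier is a COMPLEX: prefix followed by member identifiers joined with underscores. A minimal base R sketch for splitting such identifiers into their members follows; split_complex is a hypothetical helper, not an OmnipathR function.

```r
# Hypothetical helper: strip the "COMPLEX:" prefix and split on underscores.
split_complex <- function(x) {
  strsplit(sub("^COMPLEX:", "", x), "_", fixed = TRUE)
}

# Identifiers copied from the output above.
ids <- c("COMPLEX:P20783_P23560", "COMPLEX:P08476_P09529")
members <- split_complex(ids)
print(members)
```

The members of one complex belong to the same annotated entity, so they are not a ligand-receptor pair by themselves.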
Hi,
You should use import_intercell_network() instead of import_omnipath_intercell(). The former combines two intercell annotation tables with one network table, while the latter provides only one intercell annotation table.
So first do this (optionally with custom parameters):
icn <- import_intercell_network()
This data frame has 44 columns because it is combined from 3 data frames, so some of them may be redundant. Some of the important ones:
The consensus_score for intercell annotations is the number of resources supporting a certain annotation. It is comparable only within a category, because the total number of resources is different for each category (e.g. if we have 9 resources describing ligands and only 2 of them annotate a protein as a ligand, then it's a low value; however, if we have 2 resources for protease inhibitors and both of them annotate a protein as such, then it's a high value).
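The arithmetic behind that example can be sketched in base R. The resource counts (9 and 2) come from the example above; the gene names, the category label protease_inhibitor, and the fraction column are placeholders introduced here and are not part of the OmnipathR output.

```r
# Two hypothetical annotations with the same raw consensus_score.
scores <- data.frame(
  genesymbol = c("GENE_A", "GENE_B"),            # placeholder names
  category = c("ligand", "protease_inhibitor"),  # second label is hypothetical
  consensus_score = c(2, 2),
  stringsAsFactors = FALSE
)

# Total number of resources per category, as in the example above.
n_resources <- c(ligand = 9, protease_inhibitor = 2)

# Fraction of the category's resources that support each annotation.
scores$fraction <- unname(scores$consensus_score / n_resources[scores$category])
print(scores)
```

The same raw score of 2 corresponds to 2/9 of the ligand resources but to all of the protease inhibitor resources, which is why raw scores should be compared only within a category.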
I hope this helps.
Best,
Denes
Here are the functions provided by OmnipathR_1.2.1 in my R session:
> OmnipathR::
OmnipathR::get_annotation_databases OmnipathR::import_Omnipath_PTMS
OmnipathR::get_complex_genes OmnipathR::import_Omnipath_annotations
OmnipathR::get_complexes_databases OmnipathR::import_Omnipath_complexes
OmnipathR::get_interaction_databases OmnipathR::import_Omnipath_intercell
OmnipathR::get_intercell_categories OmnipathR::import_PathwayExtra_Interactions
OmnipathR::get_intercell_classes OmnipathR::import_TFregulons_Interactions
OmnipathR::get_ptms_databases OmnipathR::import_miRNAtarget_Interactions
OmnipathR::get_signed_ptms OmnipathR::interaction_graph
OmnipathR::import_AllInteractions OmnipathR::printPath_es
OmnipathR::import_KinaseExtra_Interactions OmnipathR::printPath_vs
OmnipathR::import_LigrecExtra_Interactions OmnipathR::print_interactions
OmnipathR::import_Omnipath_Interactions OmnipathR::ptms_graph
> OmnipathR::import_intercell_network
Error: 'import_intercell_network' is not an exported object from 'namespace:OmnipathR'
> OmnipathR:::import_intercell_network
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
object 'import_intercell_network' not found
After installing the version from GitHub (OmnipathR_1.3.7), now I have the function available.
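A quick way to check whether an installed version exports a given function before calling it is sketched below in base R. has_export is a hypothetical helper written for this illustration; it returns FALSE when the package is not installed.

```r
# Hypothetical helper: TRUE only if the package is installed and exports fn.
has_export <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

# FALSE with the 1.2.1 release described above, TRUE once a version
# that exports the function is installed.
print(has_export("OmnipathR", "import_intercell_network"))
```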
The result looks excellent! Thank you so much.
> icn %>% filter(source_genesymbol == "CXCL13") %>% head(1) %>% t
[,1]
category_intercell_source "ligand"
parent_intercell_source "ligand"
source "O43927"
target "O00574"
category_intercell_target "receptor"
parent_intercell_target "receptor"
target_genesymbol "CXCR6"
source_genesymbol "CXCL13"
is_directed "1"
is_stimulation "1"
is_inhibition "0"
consensus_direction "1"
consensus_stimulation "1"
consensus_inhibition "0"
dip_url ""
sources "Wang"
references ""
curation_effort "0"
n_references "0"
n_resources "1"
database_intercell_source "Matrisome;iTALK;HGNC;CellPhoneDB;GO_Intercell;HPMR;ICELLNET;Ramilowski2015;Kirouac2010;Guide2Pharma;LRdb;Baccin2019;OmniPath"
scope_intercell_source "generic"
aspect_intercell_source "functional"
category_source_intercell_source "resource_specific"
genesymbol_intercell_source "CXCL13"
entity_type_intercell_source "protein"
consensus_score_intercell_source "12"
transmitter_intercell_source "TRUE"
receiver_intercell_source "FALSE"
secreted_intercell_source "TRUE"
plasma_membrane_transmembrane_intercell_source "FALSE"
plasma_membrane_peripheral_intercell_source "FALSE"
database_intercell_target "iTALK;Almen2009;CellCellInteractions;EMBRACE;HGNC;CellPhoneDB;GO_Intercell;HPMR;ICELLNET;Surfaceome;Ramilowski2015;Kirouac2010;Guide2Pharma;LRdb;Baccin2019;OmniPath"
scope_intercell_target "generic"
aspect_intercell_target "functional"
category_source_intercell_target "resource_specific"
genesymbol_intercell_target "CXCR6"
entity_type_intercell_target "protein"
consensus_score_intercell_target "15"
transmitter_intercell_target "FALSE"
receiver_intercell_target "TRUE"
secreted_intercell_target "FALSE"
plasma_membrane_transmembrane_intercell_target "TRUE"
plasma_membrane_peripheral_intercell_target "FALSE"
Thank you for describing the consensus score!
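For anyone who only needs the gene pairs, here is a minimal base R sketch that reduces a table with the columns shown above to unique source/target pairs. icn_toy is a hand-made stand-in for the real result, and the duplicated row is an assumption used to show why unique() is useful.

```r
# Toy stand-in for the network table; column names follow the output above.
icn_toy <- data.frame(
  source_genesymbol = c("CXCL13", "CXCL13", "IGF1"),
  target_genesymbol = c("CXCR6", "CXCR6", "IGF1R"),
  category_intercell_source = c("ligand", "ligand", "ligand"),
  category_intercell_target = c("receptor", "receptor", "receptor"),
  stringsAsFactors = FALSE
)

# Keep one row per gene pair.
pairs <- unique(icn_toy[, c("source_genesymbol", "target_genesymbol")])
rownames(pairs) <- NULL
print(pairs)

# Look up the partners recorded for one receptor.
print(pairs[pairs$target_genesymbol == "CXCR6", ])
```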
Hello,
It seems that you are using the release version from Bioconductor. That version does not yet have all the functionality (Bioconductor will be updated soon, by the end of October). Until then, you can install the package from this GitHub repo (version 1.3.7).
Best, Alberto.
Closing because it looks like the Bioconductor vs. dev version difference caused the confusion. Feel free to reopen or comment if you have any further questions.
Could I please ask if OmnipathR provides access to a list of gene pairs?
That is, I would like to get a dataframe where each row corresponds to a pair of genes (or proteins), e.g., CXCL10 and CXCR3.
So far, here's what I could figure out from the vignette:
Notice that CXCR3 is listed by itself, but there is no hint that CXCL10 is a ligand for this receptor.
Is there some way to run OmnipathR to get the known pairs of proteins that interact?
I'll keep looking in the documentation, but I would greatly appreciate any hints or tips!
Thank you.