saezlab / dorothea

R package to access DoRothEA's regulons
https://saezlab.github.io/dorothea/
GNU General Public License v3.0

Reliability of mode of regulation #44

asumann closed this issue 2 years ago

asumann commented 3 years ago

Hi,

I wonder if it makes sense to create a Cytoscape TF-target network based on DoRothEA interactions. In particular, what do you think of using the mode of regulation (MoR) as an edge attribute? I assume the MoR information is reliable to use, but my question comes from a possibly contradictory fact included in the paper:

Here, when the MoR of the TF–target interaction was not defined by the original data set (i.e., those derived from TFBS predictions, ChIP-seq data and most of the curated databases), we assumed a positive regulatory effect of the TF on the target. However, if the TF is known to be a global repressor (data extracted from UniProt) (Supplemental Table S1), the interactions are assumed to have a negative regulatory effect.
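
The assignment rule quoted above can be sketched on a toy table as follows. This is only an illustration under stated assumptions, not DoRothEA's actual code: the column names (tf, target, mor), the gene symbols and the global_repressors vector are all hypothetical.

```r
# Toy TF-target table; NA marks interactions whose source data set gave no MoR.
# All names and values here are hypothetical.
regulons <- data.frame(
    tf     = c("TF_A", "TF_A", "TF_B", "TF_B"),
    target = c("G1", "G2", "G3", "G4"),
    mor    = c(1, NA, NA, -1),
    stringsAsFactors = FALSE
)

# Hypothetical list of TFs annotated as global repressors.
global_repressors <- c("TF_B")

# Remember which signs are assumed rather than taken from the source data.
missing <- is.na(regulons$mor)
regulons$mor_assumed <- missing

# Fill in missing MoR: -1 for global repressors, +1 otherwise.
regulons$mor[missing] <- ifelse(regulons$tf[missing] %in% global_repressors, -1, 1)

print(regulons)
```

Keeping a flag such as mor_assumed makes it possible to tell measured signs from assumed ones later, for example when styling edges.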

How about using OmniPath TF-target interactions and filtering by curation effort > 2?

Thanks for the resource!

christianholland commented 3 years ago

Hi @asumann,

thanks for your interest in dorothea.

It should be fine to use the mode of regulation as an edge attribute, since we also consider the mode of regulation in our benchmark studies of dorothea. But please keep in mind that we had reliable information about the mode of regulation for only a very small fraction of TF-target interactions; for almost all others we made a strong assumption and assigned positive regulation.

possibly contradictory fact included in the paper

Could you please elaborate on what you mean by that?

How about using OmniPath TF-target interactions and filtering by curation effort > 2?

The regulons in OmniPath come from DoRothEA. What do you mean by curation effort? That the interaction was reported in two curated resources?

asumann commented 3 years ago

Could you please elaborate on what you mean by that?

I thought the MoR could be contradictory without a given likelihood for an interaction, or given the strong assumption you mentioned.

Using OmnipathR's import_all_interactions(resources="DoRothEA"), I get a data frame with a curation_effort column. In the paper it is defined as "unique reference-interaction pairs". To my understanding, it reflects evidence for the interaction independently of the DoRothEA confidence levels. If that is right, I thought that additionally filtering by curation effort would give more support for the MoR.

Sorry if this is a huge misunderstanding.

christianholland commented 3 years ago

Hi @asumann,

apologies for the late reply. I was not much involved in the development of OmnipathR, but maybe @deeenes or @alberto-valdeolivas can help here?

deeenes commented 3 years ago

Hi Asuman,

I can second Christian that wherever effect signs are available, it's better to use them, even if the coverage is far from complete. It's just better to have incomplete information than nothing.
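
One way to carry the sign into Cytoscape is to write an edge table with the MoR as an extra column and load it through Cytoscape's import of a network from a table file. A minimal sketch on made-up data; the column names, the labels and the output file name are assumptions chosen for illustration:

```r
# Toy signed TF-target table (made-up values).
edges <- data.frame(
    source = c("TF_A", "TF_A", "TF_B"),
    target = c("G1", "G2", "G3"),
    mor    = c(1, -1, 1),
    stringsAsFactors = FALSE
)

# Human-readable interaction type derived from the sign.
edges$interaction <- ifelse(edges$mor > 0, "activation", "repression")

# Write a CSV that can be imported as a network from a table:
# source/target become nodes, mor/interaction become edge attributes.
out_file <- file.path(tempdir(), "tf_target_edges.csv")
write.csv(edges, out_file, row.names = FALSE)

# Read the file back to check what Cytoscape would see.
print(read.csv(out_file, stringsAsFactors = FALSE))
```

In Cytoscape, the mor or interaction column can then be mapped to an edge style such as colour or arrow shape.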

In the OmniPath interactions database there are two datasets of gene regulatory interactions: "dorothea" and "tf_target". The latter consists mostly of literature-curated resources integrated directly into OmniPath. Most of these resources are also part of DoRothEA, although they are processed by different code, so the two don't overlap perfectly. With OmnipathR you can download the tf_target dataset with import_tf_target_interactions:

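library(magrittr)  # provides the tee pipe %T>%
library(dplyr)
library(OmnipathR)
# The tee pipe prints the counts but passes the data frame on unchanged
tf_target <-
    import_tf_target_interactions() %T>%
    {print(nrow(.))} %T>%  # total number of interactions
    {filter(., grepl('DoRothEA', sources)) %>% nrow %>% print}  # those also listed in DoRothEA
# [1] 61930
# [1] 15347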

As you can see, this dataset contains ~62k interactions, of which ~15k can also be found in the A-D levels of DoRothEA.

In OmnipathR, DoRothEA A-D levels are available by import_dorothea_interactions. This dataset includes ~280k interactions:

library(magrittr)
library(OmnipathR)
dorothea <-
    import_dorothea_interactions(dorothea_levels = c('A', 'B', 'C', 'D')) %T>%
    {print(nrow(.))}
# [1] 279590

To download the two datasets together, you can use the import_transcriptional_interactions function:

library(magrittr)
library(OmnipathR)
transcriptional <-
    import_transcriptional_interactions(dorothea_levels = c('A', 'B', 'C', 'D'), fields = 'datasets') %T>%
    {print(nrow(.))}
# [1] 326173

The import_all_interactions function also includes PPI and miRNA interactions. The curation_effort column is an optional column in OmniPath's interactions query. It corresponds to the number of resource-reference pairs supporting an interaction; for interactions with no literature references it is zero.
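
As a rough sketch of the filtering asked about earlier in the thread, here is a threshold on curation_effort applied to a toy data frame. The values are made up, and the column names other than curation_effort are assumptions about the layout of the downloaded table; the threshold of 2 is the one from the question, not a recommendation.

```r
# Toy stand-in for an interactions table; values are made up for illustration.
interactions <- data.frame(
    source_genesymbol = c("TF1", "TF1", "TF2", "TF3"),
    target_genesymbol = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
    curation_effort   = c(5, 0, 3, 2),
    stringsAsFactors  = FALSE
)

# Interactions without any literature reference have curation_effort == 0.
n_no_reference <- sum(interactions$curation_effort == 0)

# Keep interactions supported by more than two resource-reference pairs.
min_effort <- 2
well_supported <- interactions[interactions$curation_effort > min_effort, ]

print(n_no_reference)
print(well_supported)
```

Note that such a filter removes every interaction that comes only from resources without literature references, regardless of its DoRothEA confidence level.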

Best,

Denes