saezlab / dorothea

R package to access DoRothEA's regulons
https://saezlab.github.io/dorothea/
GNU General Public License v3.0

Regulons for cancer datasets #20

Closed: lxw391 closed this issue 4 years ago

lxw391 commented 4 years ago

My question is: given that the TCGA regulons are located in the deprecated branch (https://github.com/saezlab/dorothea/blob/deprecated/data/TFregulons/consensus/Robjects_VIPERformat/pancancer/TOP10_pancancer_viperRegulon.rdata), can people still use them for their papers? Also, how might one find the confidence level for the TCGA regulons?

christianholland commented 4 years ago

Hi @lxw391,

did you close this issue because you solved your question on your own?

In any case, here are my brief answers:

i) The TCGA regulons also contain confidence levels, even though the notation is a bit convoluted. Each regulon is named TF_confidence level, e.g. "AR_A" means that the TF AR has confidence level A.
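For illustration, the naming scheme can be unpacked with a few lines of code. This is only a minimal sketch in Python (the package itself is R). It assumes the confidence level is whatever follows the last underscore in the regulon name. The helper `split_regulon_name` and the second example name are hypothetical and not part of dorothea.

```python
def split_regulon_name(name):
    """Split a 'TF_confidence' regulon name into (tf, confidence).

    Assumption: the confidence level is the part after the LAST underscore,
    so a TF symbol that itself contains an underscore would still parse.
    """
    tf, sep, confidence = name.rpartition("_")
    if not sep or not tf or not confidence:
        raise ValueError(f"expected a name like 'AR_A', got {name!r}")
    return tf, confidence


# "AR_A" is the example from this thread; "TP53_B" is a made-up name.
names = ["AR_A", "TP53_B"]
confidence_by_tf = dict(split_regulon_name(n) for n in names)
```

Here `confidence_by_tf` maps each TF symbol to its confidence level, which makes it easy to filter regulons by level before running an analysis.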

ii) Yes, you can certainly use these regulons for your project. For now the TCGA regulons are in the deprecated branch because we completely refactored DoRothEA and have focused on the GTEx-based regulons. For the next Bioconductor release we might consider adding the TCGA regulons to the package as well.

lxw391 commented 4 years ago

Thank you very much for the prompt reply @christianholland! Those are very helpful comments, as we're still trying to decide which database to use. My last question (I hope) is: how are the TCGA regulons you sent different from those in Suppl. Table 4? We compared them and found that they differ. Thanks again.

christianholland commented 4 years ago

You can find the regulons that correspond to Table S4 here: https://github.com/saezlab/dorothea/blob/deprecated/data/TFregulons/consensus/table/database_pancancer_20181030.csv.zip

There we list every single interaction that we found in any of our four lines of evidence. Each interaction is assigned a confidence level based on the amount of supporting evidence (see Figure 5A in the manuscript).

"To provide the most confident regulon for each TF, we then aggregated the TF-target interactions with the highest possible confidence score that resulted in a regulon equal to or greater than ten targets. The final confidence score assigned to the TF regulon is the lowest confidence score of its component targets". (taken from the manuscript)
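The quoted selection rule can be sketched roughly as follows. This is a minimal illustration in Python, not the code used to build the published regulons. It assumes that confidence levels are ordered A (highest) to E (lowest), that interactions arrive as (TF, target, level) tuples, and that TFs which never reach ten targets are simply skipped. The function name `build_regulons` and the toy data are hypothetical.

```python
from collections import defaultdict

LEVELS = "ABCDE"   # assumed ordering: A = highest confidence, E = lowest
MIN_TARGETS = 10   # "equal to or greater than ten targets"


def build_regulons(interactions, min_targets=MIN_TARGETS):
    """Return {tf: (regulon_level, sorted_targets)} for TFs reaching min_targets."""
    targets_by_tf = defaultdict(dict)
    for tf, target, level in interactions:
        # If a TF-target pair is listed twice, keep its best (earliest) level.
        previous = targets_by_tf[tf].get(target)
        if previous is None or LEVELS.index(level) < LEVELS.index(previous):
            targets_by_tf[tf][target] = level

    regulons = {}
    for tf, targets in targets_by_tf.items():
        selected = set()
        for level in LEVELS:
            # Add all targets of this level, starting from the most confident.
            selected |= {t for t, lvl in targets.items() if lvl == level}
            if len(selected) >= min_targets:
                # The level that was needed last is the lowest one in the
                # regulon, so it becomes the regulon's confidence score.
                regulons[tf] = (level, sorted(selected))
                break
    return regulons


# Toy data: TF1 has 4 A-level, 7 B-level and 3 C-level targets; TF2 has only 3.
demo = (
    [("TF1", f"a{i}", "A") for i in range(4)]
    + [("TF1", f"b{i}", "B") for i in range(7)]
    + [("TF1", f"c{i}", "C") for i in range(3)]
    + [("TF2", f"g{i}", "A") for i in range(3)]
)
result = build_regulons(demo)
```

In the toy data, TF1 only reaches ten targets once the B-level interactions are included, so its regulon gets level B and the C-level targets are left out. This is the reason a regulon file can contain fewer interactions than the full interaction table.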

I hope this answers your question. Don't hesitate to come back to me if it's not clear.

lxw391 commented 4 years ago

Sounds great, we really appreciate all your help.