aertslab / create_cisTarget_databases

Create cisTarget databases

Can I apply cisTarget to estimate the RBP-regulon? #20

Open kerenzhou062 opened 2 years ago

kerenzhou062 commented 2 years ago

Hi,

I have binding sites of RNA-binding proteins (RBPs) derived from CLIP-seq data, and I want to run motif enrichment and RBP-regulon prediction, with the result used as input for SCENIC. How can I do this? BTW, I also ran motif enrichment with HOMER, which reports motifs in PWM format like the one below:

>TGCATG 1-TGCATG,BestGuess:hsa-miR-4262 MIMAT0016894 Homo sapiens miR-4262 Targets (miRBase)(0.647) 5.179177    -34261.033795   0   T:22354.0(48.77%),B:2355.3(5.54%),P:1e-14879
0.001   0.044   0.001   0.954
0.001   0.001   0.997   0.001
0.001   0.997   0.001   0.001
0.997   0.001   0.001   0.001
0.001   0.001   0.001   0.997
0.001   0.001   0.997   0.001

Best, Keren

ghuls commented 2 years ago

SCENIC is designed for doing motif enrichment of proteins that bind to DNA (transcription factors), so using it for RBPs probably does not make much sense.

kerenzhou062 commented 2 years ago

> SCENIC is designed for doing motif enrichment of proteins that bind to DNA (transcription factors), so using it for RBPs probably does not make much sense.

Thank you for your information!

Though SCENIC is designed for TFs, in my opinion TFs and RBPs are quite similar in how they regulate their targets: RBPs also exert their functions by recognizing targets via motifs. Could you please explain in more detail why you think SCENIC is not suitable for RBPs?

Best,

Keren

ghuls commented 2 years ago

You will need to replace at least the first step of pySCENIC (gene regulatory network inference) with something that makes sense for RBPs.

You are probably aware of it, but if not, CISBP-RNA has motifs for a number of RBPs: http://cisbp-rna.ccbr.utoronto.ca/ You will need to rescale those motifs to count matrices of 100.
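As a rough illustration of the rescaling step, the sketch below converts a HOMER-style probability matrix (like the one posted above) into integer counts that sum to 100 per position and writes them under a `>motif_id` header, one row of A, C, G, T counts per position. This is only a sketch under assumptions: the function names are hypothetical, using the first header token as the motif ID is an arbitrary choice, and the exact output layout expected by the database creation scripts should be checked against their documentation.

```python
from math import floor


def parse_homer_motifs(text):
    """Parse HOMER-style motif text into {motif_id: [[pA, pC, pG, pT], ...]}."""
    motifs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: use the first header token (the consensus) as motif ID.
            name = line[1:].split()[0]
            motifs[name] = []
        elif name is not None:
            motifs[name].append([float(x) for x in line.split()])
    return motifs


def rescale_row(probs, total=100):
    """Turn one row of probabilities into integer counts summing to `total`."""
    row_sum = sum(probs)
    scaled = [p / row_sum * total for p in probs]
    counts = [floor(x) for x in scaled]
    # Hand out the counts lost to flooring, largest fractional part first.
    leftover = total - sum(counts)
    order = sorted(range(len(probs)),
                   key=lambda i: scaled[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def to_count_matrix_text(motifs, total=100):
    """Format motifs as '>motif_id' followed by tab-separated count rows."""
    lines = []
    for name, rows in motifs.items():
        lines.append(f">{name}")
        for row in rows:
            lines.append("\t".join(str(c) for c in rescale_row(row, total)))
    return "\n".join(lines) + "\n"


homer_text = """>TGCATG 1-TGCATG 5.179177
0.001 0.044 0.001 0.954
0.001 0.001 0.997 0.001
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.001 0.001 0.997
0.001 0.001 0.997 0.001
"""

motifs = parse_homer_motifs(homer_text)
cb_text = to_count_matrix_text(motifs)
print(cb_text)
```

Largest-remainder rounding is used here so that every position sums to exactly 100; plain rounding of each value could leave a row at 99 or 101.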

kerenzhou062 commented 2 years ago

> You will need to replace at least the first step of pySCENIC (gene regulatory network inference) with something that makes sense for RBPs.
>
> You are probably aware of it, but if not, CISBP-RNA has motifs for a number of RBPs: http://cisbp-rna.ccbr.utoronto.ca/ You will need to rescale those motifs to count matrices of 100.

Thank you for your suggestions!

Do you mean that I need to filter the GRNs with a more reasonable cutoff for the Pearson product-moment correlation (default: ρ ≥ +0.03 for positive and ρ ≤ −0.03 for negative)? RBPs usually bind directly to their targets and influence their stability, so would it be acceptable to run pySCENIC with a more stringent cutoff, such as 0.1 or higher?
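To make the effect of the cutoff concrete, here is a minimal, library-free sketch of labelling regulator-target links by their Pearson correlation. It does not call pySCENIC; the function names, gene names, and expression values are made up for illustration, and only the 0.03 and 0.1 thresholds come from the discussion above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant expression: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def label_links(adjacencies, expression, rho_threshold=0.03):
    """Label each (regulator, target, importance) link by correlation sign."""
    labelled = []
    for regulator, target, importance in adjacencies:
        rho = pearson(expression[regulator], expression[target])
        if rho >= rho_threshold:
            regulation = "activating"
        elif rho <= -rho_threshold:
            regulation = "repressing"
        else:
            regulation = "excluded"
        labelled.append((regulator, target, importance, rho, regulation))
    return labelled


# Toy expression values across five cells (hypothetical data).
expression = {
    "RBP1": [1, 2, 3, 4, 5],
    "GENE_A": [2, 4, 6, 8, 10],    # strongly co-expressed with RBP1
    "GENE_B": [5, 4, 3, 2, 1],     # strongly anti-correlated with RBP1
    "GENE_C": [3, 1, 3, 1, 3.2],   # weakly correlated with RBP1
}
adjacencies = [("RBP1", "GENE_A", 9.0), ("RBP1", "GENE_B", 7.5),
               ("RBP1", "GENE_C", 1.2)]

default_labels = {t: lab for _, t, _, _, lab in label_links(adjacencies, expression, 0.03)}
strict_labels = {t: lab for _, t, _, _, lab in label_links(adjacencies, expression, 0.1)}
print(default_labels)
print(strict_labels)
```

In this toy data, the weakly correlated link to `GENE_C` passes the 0.03 cutoff but is dropped at 0.1, while the strongly correlated and anti-correlated links survive both.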

Best,

Keren

ghuls commented 2 years ago

If you know which RBPs bind to which targets, you won't need the GRN-related code of pySCENIC, because that step infers the TF-to-target-gene relations rather than using known ones.

Playing with the cutoff will likely be necessary.

kerenzhou062 commented 2 years ago

> If you know which RBPs bind to which targets, you won't need the GRN-related code of pySCENIC, because that step infers the TF-to-target-gene relations rather than using known ones.
>
> Playing with the cutoff will likely be necessary.

Yes, we can actually get the RBP-target relationships, but it is really hard to rank them, and a ranking is required for Module Generation (Step 6). Also, like TFs, RBPs can repress their targets, which is hard to infer from binding information alone. So, in my opinion, constructing the GRNs is still necessary.

To improve prediction accuracy, might filtering the GRNs by RBP-target relationships before the Module Generation step be a good strategy?
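The filtering idea could look roughly like the sketch below: keep only the inferred links whose target is also supported by binding evidence, then rank the surviving links by the importance score from the network inference step. This is a sketch under assumptions; the function name, the `clip_targets` mapping, and all gene names and scores are hypothetical, and it is not part of pySCENIC.

```python
def filter_by_binding(adjacencies, clip_targets):
    """Keep (regulator, target, importance) links supported by binding data.

    `clip_targets` maps each RBP to the set of genes it binds (e.g. from
    CLIP-seq peaks). Surviving links are returned sorted by importance,
    highest first, so the inferred network supplies the ranking.
    """
    kept = [
        (regulator, target, importance)
        for regulator, target, importance in adjacencies
        if target in clip_targets.get(regulator, set())
    ]
    return sorted(kept, key=lambda link: link[2], reverse=True)


# Hypothetical inferred links and CLIP-derived binding targets.
adjacencies = [
    ("RBP1", "GENE_A", 4.2),
    ("RBP1", "GENE_B", 9.1),
    ("RBP1", "GENE_C", 6.3),   # inferred, but no binding evidence
    ("RBP2", "GENE_A", 2.0),   # RBP2 has no binding data at all
]
clip_targets = {"RBP1": {"GENE_A", "GENE_B"}}

filtered = filter_by_binding(adjacencies, clip_targets)
print(filtered)
```

With this arrangement the binding data decides which links are eligible, and the expression-based importance decides their order.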

Best,

Keren