aertslab / pySCENIC

pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

How to update motif-to-TF annotation database with additional TFs #161

Closed: fbrundu closed this issue 4 years ago

fbrundu commented 4 years ago

I'm using the pySCENIC pipeline and I noticed that some of the TFs I'm interested in are not present in the motif-to-TF annotation database (Column "gene_name", Human motif collection v9, downloaded from https://resources.aertslab.org/cistarget/). Is this database compiled through a part of the pySCENIC pipeline, or from a different tool?

Thanks

cflerin commented 4 years ago

Hi @fbrundu ,

The databases are not generated with pySCENIC, but are part of the cisTarget tool. Which TFs in particular are you interested in? It would be interesting to see why they are missing.

fbrundu commented 4 years ago

Thanks! I was interested in particular in this list of TFs: HIC2, HIRA, LZTR1, MED15, ZNF74, TBX1. But in general, what's the process for selecting or filtering TFs?

fbrundu commented 4 years ago

Hi @cflerin, is there any reference documentation that I can use to include these or additional TFs, or to see why they are excluded? Should I look into the cisTarget tool? Thanks!

cflerin commented 4 years ago

Hi @fbrundu , sorry for the delay. Of the TFs you listed, only HIC2, TBX1, and ZNF74 are present in the cisTarget databases. So I wouldn't expect to find the others at all in a pySCENIC analysis. If you're not finding the ones that are present, you could try lowering the pruning thresholds, or running a "manual" analysis of the expression modules in iRegulon (see here, the section "iRegulon analysis" for relevant links). The manual analysis is good for looking at a small number of regulons in detail.

If you want more information on cisTarget, you can look at the two papers describing the details: paper 1, and paper 2
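For anyone who wants to repeat this kind of presence check on their own TF list, here is a minimal sketch. It assumes the motif-to-TF annotation table is tab-delimited with a header row that includes the "gene_name" column mentioned in the first post; the helper name `missing_tfs` and the three-row toy table are made up for illustration and are not the real v9 data.

```python
import csv
import io


def missing_tfs(annotation_file, tfs, column="gene_name"):
    """Return the TFs from `tfs` that never appear in `column` of the table."""
    # Assumption: tab-delimited text with a header row.
    reader = csv.DictReader(annotation_file, delimiter="\t")
    annotated = {row[column] for row in reader}
    return sorted(set(tfs) - annotated)


# Toy stand-in for the annotation table (hypothetical rows, not real motifs).
toy_table = io.StringIO(
    "#motif_id\tgene_name\n"
    "toy_motif_1\tHIC2\n"
    "toy_motif_2\tTBX1\n"
    "toy_motif_3\tZNF74\n"
)

tfs_of_interest = ["HIC2", "HIRA", "LZTR1", "MED15", "ZNF74", "TBX1"]
print(missing_tfs(toy_table, tfs_of_interest))
# → ['HIRA', 'LZTR1', 'MED15']
```

To run it against the downloaded table, pass an open file handle instead of the `io.StringIO` object. A TF that is absent from the table cannot come out of the motif pruning step as a regulon, whatever thresholds are used.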

fbrundu commented 4 years ago

Hi @cflerin, thank you very much. I will look into the papers you sent and the iRegulon manual analysis, that's very helpful! I will also try to find out whether the TFs that are missing from the cisTarget databases can be added in some way, so if you have any information please let me know. Thank you for your help!

fbrundu commented 3 years ago

I have an additional question regarding pySCENIC and iRegulon. I ran both and I get slightly different results (which is expected). However, does it make sense to intersect the two result sets, or should I only consider the results from the newer pySCENIC? Thanks!

XYZuo commented 3 years ago

Hi @cflerin and @fbrundu! In the reply from @cflerin I noticed the suggestion to lower the pruning thresholds. How can I do that? Thanks in advance to anybody who can let me know!
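One possible way to do this is sketched below, under some assumptions. The `pyscenic ctx` command-line step accepts `--nes_threshold`, `--auc_threshold` and `--rank_threshold` options, and lowering the NES threshold lets more enriched motifs through the pruning step. Whether these are the thresholds meant in the reply is an assumption, the default values in the signature are written from memory and should be checked against `pyscenic ctx --help` for the installed version, and every file name is a placeholder. The helper `build_ctx_command` is hypothetical and only assembles the argument list; it does not run anything.

```python
import shlex


def build_ctx_command(adjacencies, databases, annotations, expression,
                      output="regulons.csv", nes_threshold=3.0,
                      auc_threshold=0.05, rank_threshold=5000):
    """Assemble an argument list for `pyscenic ctx` (nothing is executed)."""
    # Assumption: defaults above mirror the CLI defaults; verify with --help.
    return [
        "pyscenic", "ctx", adjacencies, *databases,
        "--annotations_fname", annotations,
        "--expression_mtx_fname", expression,
        "--output", output,
        "--nes_threshold", str(nes_threshold),
        "--auc_threshold", str(auc_threshold),
        "--rank_threshold", str(rank_threshold),
    ]


# Placeholder file names; only the NES threshold is lowered here.
cmd = build_ctx_command(
    "adjacencies.tsv",
    ["ranking_database.feather"],
    "motif_annotations.tbl",
    "expression.loom",
    nes_threshold=2.5,
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)`. Looser thresholds will also admit weaker motif enrichments, so any extra regulons deserve a closer look before they are trusted.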
