aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

identifying TF regulons for conditions in scRNAseq within each of the cell types[results] #189

Closed deevdevil88 closed 4 years ago

deevdevil88 commented 4 years ago

Hi, I am interested in identifying TF regulons that differ across cell types (the pySCENIC tutorials show this), but also across conditions within each cell type, so I wondered what the best way to go about the analysis is. As the AUC ranking is based on all cells, I thought it's best to analyse the entire dataset as a whole so that AUC scores are comparable across cell types, and then compute a regulon specificity score (RSS) for conditions, either after subsetting the results by cell type or simply by calculating the RSS with an input column that merges cell type + condition. I wasn't sure which is the best way to go about this.

Best, Devika

cflerin commented 4 years ago

Hi @deevdevil88 ,

I may have just answered part of your question in #185. Generally, I would run the conditions separately, as long as there isn't an excessive number of them, and then also run the complete dataset (again, as long as the total number of cells is reasonable). So you could have something like tumor, normal, and tumor+normal.
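As a rough illustration of that layout, here is a minimal sketch that groups cell IDs into one run per condition plus one combined run. The function name `plan_runs`, the `max_conditions` cutoff, and the toy cell IDs are all hypothetical and not part of pySCENIC; each resulting list would then be used to subset the expression matrix before a separate pySCENIC run.

```python
from collections import defaultdict


def plan_runs(cell_conditions, max_conditions=5):
    """Return {run_name: [cell IDs]}: one run per condition plus a combined run.

    `max_conditions` is an arbitrary, hypothetical cutoff standing in for
    "not an excessive number of conditions".
    """
    by_condition = defaultdict(list)
    for cell_id, condition in cell_conditions.items():
        by_condition[condition].append(cell_id)

    if len(by_condition) > max_conditions:
        raise ValueError("too many conditions to run each one separately")

    runs = {cond: sorted(ids) for cond, ids in by_condition.items()}
    # Combined run over all cells, named after the sorted condition labels.
    runs["+".join(sorted(by_condition))] = sorted(cell_conditions)
    return runs


# Toy annotation: cell ID -> condition (made-up data).
cells = {"c1": "tumor", "c2": "normal", "c3": "tumor", "c4": "normal"}
runs = plan_runs(cells)
for name, ids in runs.items():
    print(name, ids)
```
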

For RSS, the metric is set up to compare one cluster of cells vs. all others. Probably the ideal way to do this is to use a clustering (Louvain, etc.) that is based on the SCENIC AUC UMAP. Since this is generated from the AUCell matrix, it represents the underlying regulons and their activity in each cell. With this approach, I would look to see if your conditions form clusters in this UMAP/tSNE -- this will indicate whether there are condition-specific regulons, which could then be identified with the RSS. You can also use celltype+condition (as you already said) as the group input -- it's also valid, but it may not produce RSS values as high as the pure clusters would, since there might be mixing of multiple celltype+condition groups, which would make it harder to identify specific regulons.
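To make the celltype+condition option concrete, here is a minimal, self-contained sketch of an RSS-style score computed against a merged label. It assumes the usual definition of RSS as one minus the square root of the Jensen-Shannon divergence between a regulon's normalized AUC values and a group's membership indicator, and it uses base-2 logarithms so the score lies in [0, 1]; the library's own scaling may differ. The helper names, regulon names, and AUC values are made up for illustration. With a real AUCell matrix you would more likely call `regulon_specificity_scores` from `pyscenic.rss` with the AUC matrix and a per-cell label series, rather than this sketch.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a == 0 contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def regulon_specificity(auc, labels, group):
    """RSS-style score of one regulon for one group vs. all other cells."""
    total = sum(auc)
    p = [a / total for a in auc]                      # regulon activity as a distribution
    member = [1.0 if lab == group else 0.0 for lab in labels]
    n = sum(member)
    q = [m / n for m in member]                       # group membership as a distribution
    return 1.0 - math.sqrt(max(0.0, js_divergence(p, q)))


# Toy annotation: merge cell type and condition into a single label per cell.
cell_type = ["T", "T", "T", "T", "B", "B", "B", "B"]
condition = ["tumor", "tumor", "normal", "normal"] * 2
labels = [f"{ct}_{cond}" for ct, cond in zip(cell_type, condition)]

# Made-up AUC values: REG_A is active only in T cells from tumor, REG_B everywhere.
auc = {
    "REG_A": [0.9, 0.8, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    "REG_B": [0.3] * 8,
}

groups = sorted(set(labels))
rss = {
    reg: {g: regulon_specificity(values, labels, g) for g in groups}
    for reg, values in auc.items()
}

for reg, scores in rss.items():
    best = max(scores, key=scores.get)
    print(reg, "most specific for", best, {g: round(s, 3) for g, s in scores.items()})
```

In this toy case REG_A scores highest for the `T_tumor` group, while the uniformly active REG_B gets the same score for every group, which is the kind of flat profile you would expect when a celltype+condition group has no specific regulon.
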

deevdevil88 commented 4 years ago

Great! I will try these out!

Devika
