saezlab / decoupleR

R package to infer biological activities from omics data using a collection of methods.
https://saezlab.github.io/decoupleR/
GNU General Public License v3.0

Using decoupleR across conditions #67

Status: Closed (singhh5050 closed this issue 1 year ago)

singhh5050 commented 1 year ago

Hi! I'm working with a scRNA-seq dataset containing cells from healthy breast tissue (normal condition) and cells from cancerous breast tissue (cancer condition). How would I go about performing pathway and transcription factor activity inference across conditions? I've followed the scRNA-seq vignettes on GitHub, but I can't figure out how to find pathways/TFs with higher activity in cancer cells than in healthy cells, for example. Is there a way to output pathways and TFs that are upregulated specifically in the cancer phenotype? Also, if I were interested in biological activities in specific cell types, is there a way to look at those? Thanks for your help!

PauBadiaM commented 1 year ago

Hi @singhh5050,

Sorry for the delayed response, I was on holiday.

To infer activities across conditions, it's better to use contrast statistics. If you have enough samples (meaning patients, not cells), you can pseudobulk your data per sample and cell type, then run a differential expression analysis between conditions (healthy vs. disease) within each cell type using your favorite statistical tool (DESeq2, limma, edgeR, simple t-tests, etc.). The resulting statistic, for example the t-values, can then be used as input to decoupler. Here is an example vignette (https://decoupler-py.readthedocs.io/en/latest/notebooks/pseudobulk.html); it's in Python, but the underlying concept is the same.
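To make the idea concrete, here is a minimal sketch of that route in plain Python with toy data. Everything in it is an assumption for illustration: the samples, genes, counts, the `net` dictionary and the helper names are made up, and the simple weighted mean only stands in for a real enrichment method. It is not the decoupler API, and a real analysis would normalize the counts and use a dedicated differential expression tool.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance

# Toy raw counts for ONE cell type: (sample, condition, {gene: count}).
# All samples, genes, and numbers are invented for illustration.
cells = [
    ("p1", "healthy", {"G1": 1, "G2": 2, "G3": 5}),
    ("p1", "healthy", {"G1": 1, "G2": 1, "G3": 4}),
    ("p2", "healthy", {"G1": 2, "G2": 2, "G3": 6}),
    ("p2", "healthy", {"G1": 2, "G2": 3, "G3": 5}),
    ("p3", "cancer", {"G1": 5, "G2": 6, "G3": 1}),
    ("p3", "cancer", {"G1": 5, "G2": 5, "G3": 2}),
    ("p4", "cancer", {"G1": 6, "G2": 7, "G3": 2}),
    ("p4", "cancer", {"G1": 6, "G2": 6, "G3": 3}),
]

# Hypothetical prior-knowledge network: source -> {target gene: weight}.
net = {"PathA": {"G1": 1.0, "G2": 1.0}, "PathB": {"G3": 1.0}}


def pseudobulk(cells):
    """Sum counts over all cells of each sample (one profile per patient)."""
    sums = defaultdict(lambda: defaultdict(int))
    condition_of = {}
    for sample, condition, counts in cells:
        condition_of[sample] = condition
        for gene, n in counts.items():
            sums[sample][gene] += n
    return sums, condition_of


def welch_t(a, b):
    """Welch's t-statistic for mean(a) - mean(b)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def contrast_stats(sums, condition_of, case="cancer", control="healthy"):
    """One t-value per gene, comparing case samples against control samples."""
    genes = sorted({g for profile in sums.values() for g in profile})
    stats = {}
    for gene in genes:
        case_vals = [sums[s][gene] for s in sums if condition_of[s] == case]
        ctrl_vals = [sums[s][gene] for s in sums if condition_of[s] == control]
        stats[gene] = welch_t(case_vals, ctrl_vals)
    return stats


def weighted_mean_activity(stats, net):
    """Score each source as the weighted mean of its targets' statistics."""
    scores = {}
    for source, targets in net.items():
        shared = [g for g in targets if g in stats]
        if not shared:
            continue  # no measured targets, so no score for this source
        total = sum(targets[g] * stats[g] for g in shared)
        scores[source] = total / sum(abs(targets[g]) for g in shared)
    return scores


sums, condition_of = pseudobulk(cells)
t_values = contrast_stats(sums, condition_of)
activities = weighted_mean_activity(t_values, net)
print(t_values)
print(activities)
```

A positive score means the source's targets shift in the direction of their weights in cancer relative to healthy. To cover several cell types, the same steps would be repeated per cell type.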

If you don't have enough true replicates (samples) and cannot pseudobulk your data, another option is to estimate activities at the single-cell level and then, within each cell type, compare the activities of cells from the two conditions.
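A rough sketch of this second route, again in plain Python with invented data: each cell gets a score per source, and the scores are then averaged per cell type and condition and compared. The `net` dictionary, the cell records and the helper names are all hypothetical, and the weighted mean is only a stand-in for whichever method you actually run.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical network: source -> {target gene: weight}.
net = {"PathA": {"G1": 1.0, "G2": 1.0}, "PathB": {"G3": 1.0}}

# Toy normalized expression per cell; all values are invented.
cells = [
    {"cell_type": "epithelial", "condition": "healthy",
     "expr": {"G1": 1.0, "G2": 2.0, "G3": 3.0}},
    {"cell_type": "epithelial", "condition": "healthy",
     "expr": {"G1": 2.0, "G2": 1.0, "G3": 3.0}},
    {"cell_type": "epithelial", "condition": "cancer",
     "expr": {"G1": 4.0, "G2": 5.0, "G3": 1.0}},
    {"cell_type": "epithelial", "condition": "cancer",
     "expr": {"G1": 5.0, "G2": 4.0, "G3": 1.0}},
    {"cell_type": "fibroblast", "condition": "healthy",
     "expr": {"G1": 2.0, "G2": 2.0, "G3": 2.0}},
    {"cell_type": "fibroblast", "condition": "cancer",
     "expr": {"G1": 2.0, "G2": 2.0, "G3": 2.0}},
]


def score_cell(expr, targets):
    """Weighted mean of a cell's expression over one source's targets."""
    shared = [g for g in targets if g in expr]
    if not shared:
        return None
    total = sum(targets[g] * expr[g] for g in shared)
    return total / sum(abs(targets[g]) for g in shared)


def mean_activity_by_group(cells, net):
    """Average per-cell scores within each (cell type, condition, source)."""
    grouped = defaultdict(list)
    for cell in cells:
        for source, targets in net.items():
            score = score_cell(cell["expr"], targets)
            if score is not None:
                key = (cell["cell_type"], cell["condition"], source)
                grouped[key].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def condition_difference(group_means, case="cancer", control="healthy"):
    """Case minus control mean activity, per cell type and source."""
    diffs = {}
    for (cell_type, condition, source), value in group_means.items():
        if condition != case:
            continue
        control_value = group_means.get((cell_type, control, source))
        if control_value is not None:
            diffs[(cell_type, source)] = value - control_value
    return diffs


group_means = mean_activity_by_group(cells, net)
diffs = condition_difference(group_means)
for key, delta in sorted(diffs.items()):
    print(key, delta)
```

Because cells from the same sample are not independent replicates, differences obtained this way are best read as descriptive rather than as a formal test.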

Hope this is helpful! If you have more questions do not hesitate to ask.

singhh5050 commented 1 year ago

Hi,

Is there code to run this in R?


PauBadiaM commented 1 year ago

Hi @singhh5050,

Unfortunately not yet. For single-cell data we have decided to focus more on the Python version because it scales better than R. If I have time in the future I'll try to add more vignettes, but this could take a while. In the meantime, you could have a look at the muscat vignette: it covers pseudobulking and differential testing, and you could then use those results as input to decoupler.
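The hand-off itself is mostly a reshaping step: a long table of differential testing results becomes a genes-by-contrasts matrix of statistics. Here is a small sketch in plain Python, assuming a table with columns named `gene`, `cluster_id` and `stat`. Those column names and all values are assumptions for illustration, not a guaranteed output format of any tool.

```python
import csv
import io

# Hypothetical long-format results: one row per gene and cell type.
de_table = """gene,cluster_id,stat
G1,epithelial,5.2
G2,epithelial,4.8
G1,fibroblast,-0.3
G3,fibroblast,2.1
"""


def to_matrix(rows, row_key="gene", col_key="cluster_id", value_key="stat"):
    """Pivot long-format rows into {gene: {cell type: statistic}}."""
    matrix = {}
    for row in rows:
        matrix.setdefault(row[row_key], {})[row[col_key]] = float(row[value_key])
    return matrix


def to_dense(matrix, columns):
    """Lay the matrix out as rows; untested gene/cell type pairs stay None."""
    return {gene: [values.get(col) for col in columns]
            for gene, values in sorted(matrix.items())}


rows = list(csv.DictReader(io.StringIO(de_table)))
matrix = to_matrix(rows)
columns = sorted({row["cluster_id"] for row in rows})
dense = to_dense(matrix, columns)

print(["gene"] + columns)
for gene, values in dense.items():
    print([gene] + values)
```

Genes that were not tested in a given cell type show up as `None` here; whether to drop them or fill them in before inferring activities is a choice you would need to make for your data.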

Hope this is helpful!