ZhangLabGT / scDisInFact

scDisInFact is a single-cell data integration and condition effect prediction framework
GNU General Public License v3.0

Is scDisInFact suitable for analyzing scRNA-seq data that includes only 3 samples? #2

Closed AiweiWu closed 3 months ago

AiweiWu commented 3 months ago

Very nice work! We recently performed scRNA-seq on 1 case and 1 control sample (batch1), and we also have scRNA-seq data for 1 control sample generated earlier (batch2). Is scDisInFact suitable for analyzing this dataset to remove batch effects and robustly recover the true changes in case versus control?

PeterZZQ commented 3 months ago

Hi,

Thank you for your interest in our work. Sorry for the late response.

Let me restate the experimental setting: you have 2 batches of scRNA-seq data. Batch 1 has 2 samples under 2 conditions (control and case), and batch 2 has 1 sample under the control condition. Is that right? If I understand correctly, this is exactly the setting scDisInFact is designed for.

You can train on all the data together using scDisInFact. Just remember to assign the correct batch ID and condition label to each sample and you should be fine. You can check the tutorial for the detailed code.
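
For the three samples described above, the labeling step could look like the sketch below. It uses only pandas. The column names `batch` and `condition`, the sample names, and the cell counts are placeholders made up for illustration; the input format scDisInFact actually expects is the one shown in the tutorial.

```python
import pandas as pd

# Hypothetical samples: names and cell counts are placeholders, not real data.
samples = [
    {"sample": "batch1_case", "batch": "batch1", "condition": "case", "n_cells": 3},
    {"sample": "batch1_ctrl", "batch": "batch1", "condition": "control", "n_cells": 2},
    {"sample": "batch2_ctrl", "batch": "batch2", "condition": "control", "n_cells": 4},
]

# One metadata row per cell, carrying the batch ID and condition label
# of the sample the cell came from.
meta_cells = pd.concat(
    [
        pd.DataFrame(
            {"sample": s["sample"], "batch": s["batch"], "condition": s["condition"]},
            index=[f"{s['sample']}_cell{i}" for i in range(s["n_cells"])],
        )
        for s in samples
    ]
)

# Sanity check of the design: cells per (batch, condition) combination.
# The batch2/case cell is expected to be empty in this setting.
design = pd.crosstab(meta_cells["batch"], meta_cells["condition"])
print(design)
```

Checking the cross-tabulation before training is a cheap way to catch a mislabeled sample, since a wrong batch or condition label would change which combinations appear as observed.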

After you have trained the model, there are two ways to study the changes between case and control:

  • One is to directly check the top-scoring genes generated by scDisInFact, but keep in mind that it selects genes that change at the whole-population level.
  • To see the heterogeneity of changes across different cell types, you can use scDisInFact to predict, for each cell under the control condition, the corresponding gene expression of that cell under the perturbed condition. Then calculate the difference in gene expression for each cell (perturbed minus control) and average the difference across cells within the cell type that you are interested in. The genes that have a larger averaged difference should be the genes that contribute the most to the condition changes within that cell type.
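
The second approach can be sketched with NumPy and pandas as follows. This is a minimal illustration under stated assumptions: `predicted` stands in for the expression matrix scDisInFact would predict for the control cells under the perturbed condition (no scDisInFact function is called here), the helper `celltype_condition_effect` is a hypothetical name, and the toy matrices, cell type labels, and gene names are invented. Ranking by absolute value is one possible choice, used so that both up- and down-regulated genes surface.

```python
import numpy as np
import pandas as pd


def celltype_condition_effect(control_counts, predicted_counts, cell_types, gene_names):
    """Average the per-cell difference (perturbed minus control) within each cell type.

    Rows of both matrices must refer to the same cells in the same order.
    Returns a DataFrame with one row per cell type and one column per gene.
    """
    diff = np.asarray(predicted_counts, dtype=float) - np.asarray(control_counts, dtype=float)
    diff_df = pd.DataFrame(diff, columns=gene_names)
    return diff_df.groupby(np.asarray(cell_types)).mean()


# Toy stand-in data: 4 control cells x 3 genes (placeholders, not real output).
control = np.array([[1.0, 5.0, 0.0],
                    [2.0, 4.0, 0.0],
                    [0.0, 1.0, 3.0],
                    [0.0, 1.0, 5.0]])
# Assumed to come from the trained model's prediction for the perturbed condition.
predicted = np.array([[3.0, 5.0, 0.0],
                      [6.0, 4.0, 1.0],
                      [0.0, 1.0, 1.0],
                      [0.0, 3.0, 1.0]])
cell_types = ["T", "T", "B", "B"]
genes = ["geneA", "geneB", "geneC"]

effect = celltype_condition_effect(control, predicted, cell_types, genes)

# Rank genes within one cell type by the magnitude of the averaged difference.
ranked_T = effect.loc["T"].abs().sort_values(ascending=False)
top_gene_T = ranked_T.index[0]
print(effect)
print("Top gene in T cells:", top_gene_T)
```

Whether to take the difference on raw counts or on normalized, log-transformed values is left open here; the same normalization should be applied to both matrices before subtracting.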

AiweiWu commented 3 months ago

Thank you so much for your reply and the detailed explanation. I'll try it to find the global and cell-type-specific expression changes.