quadbio / Pando

Multiome GRN inference.
https://quadbio.github.io/Pando/
MIT License
110 stars 21 forks

Advice about a dataset with conditions #52

Open AmelZulji opened 9 months ago

AmelZulji commented 9 months ago

Hi @joschif, congrats on the publication, and thank you for turning the analysis into a package!

I would like to ask for advice on analyzing a 10x multiome dataset (control and condition, with multiple biological replicates) using Pando.

I want to analyze each cell type separately (which should be fine, as mentioned in #49), but since I'm interested in differences between control and condition within the same cell type, I'm not sure what the best way to call infer_grn() would be. I was thinking about calling infer_grn(genes = cell_type_specific_DEGs) and interpreting the resulting network as condition-specific, but I'm not sure whether that makes sense.

What do you think, do you have any advice? Is there a use case of Pando with an experimental design similar to mine?

Would be very thankful for support! Amel

joschif commented 8 months ago

Hi @AmelZulji, when doing case-control comparisons we've usually run Pando on both conditions together and then tested for differential module activity or regulatory region accessibility. I personally would probably not constrain the GRN to DEGs, because a lot of important regulators will not be DE, so it might be hard to get a well-connected GRN.

AmelZulji commented 6 months ago

Thank you for the answer, @joschif!

When it comes to module activity, do you first compute module activity scores with AddModuleScore(), as described in this vignette? https://github.com/quadbio/scMultiome_analysis_vignette/blob/main/Tutorial.md#section-3-gene-regulatory-network-reconstruction

If yes, which test would you use for differential module activity (I have 2 groups with 5 biological replicates each)? Thank you in advance for your help!

joschif commented 6 months ago

Hi @AmelZulji, yes, AddModuleScore() is definitely one way to do this. You could then test for differential activity very simply with a t-test/ANOVA. Since you have replicates, I would probably suggest using a pseudobulk approach.
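
For illustration, the pseudobulk suggestion could be sketched roughly as follows. This is a minimal, hedged example and not part of Pando or Seurat: it is written in plain Python only to stay dependency-free, the per-cell module scores and replicate labels are made up, and the function names (`pseudobulk`, `welch_t`, `permutation_p`) are hypothetical. Each replicate is collapsed to one mean score, so replicates with more cells do not carry more weight, and the two groups are then compared with a Welch t statistic. An exact permutation p-value is swapped in for the usual t-distribution lookup so that no statistics library is needed.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

# Hypothetical per-cell module scores (e.g. one AddModuleScore() column),
# keyed by (condition, replicate). Cell numbers differ between replicates on purpose.
cells = {
    ("control", "r1"): [0.10, 0.14, 0.06],
    ("control", "r2"): [0.20, 0.12],
    ("control", "r3"): [0.05, 0.07, 0.09, 0.11],
    ("control", "r4"): [0.13, 0.15, 0.11, 0.13, 0.13],
    ("control", "r5"): [0.02, 0.10],
    ("treated", "r1"): [0.30, 0.34],
    ("treated", "r2"): [0.25, 0.27, 0.29],
    ("treated", "r3"): [0.40, 0.36, 0.38, 0.38],
    ("treated", "r4"): [0.31, 0.29],
    ("treated", "r5"): [0.35, 0.33, 0.37, 0.35, 0.35, 0.35],
}


def pseudobulk(cells):
    """Average the per-cell scores within each replicate: one value per replicate."""
    out = {}
    for (condition, _replicate), scores in cells.items():
        out.setdefault(condition, []).append(mean(scores))
    return out


def welch_t(a, b):
    """Welch t statistic for two groups of replicate-level values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def permutation_p(a, b):
    """Exact two-sided permutation p-value on |Welch t| over all group relabelings."""
    pooled = a + b
    observed = abs(welch_t(a, b))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        perm_a = [pooled[i] for i in chosen]
        perm_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so the observed labeling always counts as a hit.
        if abs(welch_t(perm_a, perm_b)) >= observed - 1e-12:
            hits += 1
    return hits / total


pb = pseudobulk(cells)
t = welch_t(pb["treated"], pb["control"])
p_value = permutation_p(pb["treated"], pb["control"])
print(f"Welch t = {t:.2f}, permutation p = {p_value:.4f}")
```

The unit of replication here is the biological replicate (5 vs 5), not the individual cell, which is the point of the pseudobulk approach. With 5 replicates per group there are only 252 possible relabelings, so the permutation p-value cannot go below 2/252.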

AmelZulji commented 6 months ago

Thank you, @joschif! One last question: to avoid bias from the number of cells, the pseudobulk profiles should be generated using averages, right?