wheaton5 / souporcell

Clustering scRNAseq by genotypes
MIT License

Integrating Souporcell outputs from multiple pooled runs #90

Open EvoMedLab opened 3 years ago

EvoMedLab commented 3 years ago

I have 18 scRNA experiments that were pooled from five mouse strains. I have split them into clusters using Souporcell, but am unclear how to integrate these together into a single object. They all split into five clusters nicely, but combining and importing them as an object (Seurat or Monocle or other approach) is unclear. Is there a suggestion?

I see that there is no current mechanism to overlay cluster information using Seurat. Is there an update to this?

alyamahmoud commented 3 years ago

@AgaricX @wheaton5 is there a recommended method for integrating clusters from different experiments that have shared samples?

wheaton5 commented 3 years ago

There is a shared_samples.py script which can do this. It takes 2 experiments (souporcell output directories) and the number of shared samples between them and matches them up.

alyamahmoud commented 3 years ago

Great! Thanks for the prompt response, I will give it a try. Thanks again.

alyamahmoud commented 3 years ago

I might have asked the question in a somewhat unclear way. I have 9 scRNAseq samples, 3/3 different phenotypes. I ran souporcell on each of the 9 samples, so I end up with 9 souporcell directories of 3 phenotypes. Does this make -n in shared_samples.py 3? Do I do the merge sequentially, first for the first two of each phenotype and then with the remaining third one? Thank you.

wheaton5 commented 3 years ago

You got it exactly. Sorry I don't have a multi-way system. I'd just pick a base experiment (the one with the most data perhaps) and run this on all experiments vs that one with -n 3 and that should match everything up. Hope this helps.
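A minimal sketch of the "base experiment vs. every other experiment" approach described above. It only builds and prints the command lines rather than running them. The `-n` flag is the one mentioned in this thread; the `-1` and `-2` flags for the two souporcell output directories are an assumption that should be checked against `shared_samples.py -h`, and the directory names are hypothetical.

```python
import shlex


def build_matching_commands(base_dir, other_dirs, n_shared, script="shared_samples.py"):
    """Return one argument list per experiment, each comparing it to the base experiment."""
    commands = []
    for other in other_dirs:
        commands.append([
            script,
            "-1", base_dir,        # assumed flag: souporcell directory of the base experiment
            "-2", other,           # assumed flag: souporcell directory of the other experiment
            "-n", str(n_shared),   # number of samples shared between the two experiments
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical souporcell output directories.
    base = "soup_exp1"
    others = ["soup_exp2", "soup_exp3"]
    for cmd in build_matching_commands(base, others, n_shared=3):
        print(shlex.join(cmd))
```

Each printed command can then be run by hand, and its output gives the cluster correspondence between that experiment and the base one.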

wheaton5 commented 3 years ago

Well this is assuming I understood your setup. You have 9 experiments with 3 individuals each? And all 9 experiments have the same 3 individuals?

alyamahmoud commented 3 years ago

9 experiments with 3 individuals each; each experiment has a different individual getting the same treatment.

wheaton5 commented 3 years ago

Still don't understand for sure lol.

alyamahmoud commented 3 years ago

I will try the setup you explained and come back to you if it seems weird :) Thank you so much, Alyaa

wheaton5 commented 3 years ago

shared_samples.py is for matching up the same individuals in different experiments. It is not for matching up the same conditions in different experiments. Is that clear?
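Once the same individuals have been matched between experiments, the per-experiment cluster numbers can be rewritten to common labels and stacked into one table, which could then be added as cell metadata in Seurat or a similar tool. The following is only a sketch under assumptions: it expects each souporcell `clusters.tsv` to have `barcode`, `status` and `assignment` columns, keeps only cells whose status is `singlet`, and takes the cluster mapping as a hand-entered dictionary (hypothetical values here) copied from the matching output.

```python
import csv
import io


def read_clusters(handle):
    """Parse a tab-separated clusters.tsv into a list of row dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


def relabel_clusters(rows, experiment_id, cluster_map):
    """Keep singlets, prefix barcodes with the experiment ID, and map clusters to shared labels."""
    relabeled = []
    for row in rows:
        if row["status"] != "singlet":
            continue  # skip doublets and unassigned cells
        relabeled.append({
            "cell": f"{experiment_id}_{row['barcode']}",
            "experiment": experiment_id,
            "individual": cluster_map[row["assignment"]],
        })
    return relabeled


if __name__ == "__main__":
    # Tiny in-memory stand-in for one experiment's clusters.tsv.
    example = "barcode\tstatus\tassignment\nAAA-1\tsinglet\t0\nCCC-1\tdoublet\t0/1\n"
    rows = read_clusters(io.StringIO(example))
    # Hypothetical mapping: cluster 0 in this experiment is cluster 2 in the base experiment.
    for record in relabel_clusters(rows, "exp2", {"0": "2"}):
        print(record)
```

Prefixing barcodes with an experiment ID avoids collisions, since the same barcode can occur in different runs.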

alyamahmoud commented 3 years ago

OK, my next question would then be: how can I interpret souporcell clusters across different individuals with the same conditions?

wheaton5 commented 3 years ago

So it's
experiment 1: individual 1 condition a, individual 2 condition b, individual 3 condition c
experiment 2: individual 4 condition a, individual 5 condition b, individual 6 condition c
etc.?

Do you have some strong marker genes that should differentiate each condition?

alyamahmoud commented 3 years ago

Yes, exactly. And yes, I do have marker genes that should differ between the conditions.

wheaton5 commented 3 years ago

In the future you may look to some interesting experimental design patterns which combine overlapping pooled samples with varying conditions. For example here is a 4 individual, 3 condition experimental design where everything can be matched up on genotypes and you control for genetic differences by having the same individual getting multiple different conditions (if these are treatments for instance).

[image: diagram of the 4-individual, 3-condition pooled experimental design]