wheaton5 / souporcell

Clustering scRNAseq by genotypes
MIT License
157 stars 45 forks

Sample Identification using Whole exome sequencing dataset #114

Open leeheetakat opened 3 years ago

leeheetakat commented 3 years ago

I ran souporcell on a mixture of six samples, and I also have whole exome sequencing (WXS) results for each individual sample (i.e., six .fastq files). How do I identify the associations between souporcell clusters and samples?

  1. I think cluster_genotypes.vcf contains a hint for this, but I don't understand the GO:GN part (likelihoods for 0/0, 0/1 or 1/0, and 1/1, but the genotype with the minimal log likelihood was not selected as the GT part). I want to know how to extract cluster-specific variants.
  2. If we have the cluster-representative SNVs from 1, we could use this information to match clusters to samples using the WXS results.
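
To make point 1 concrete, here is a rough sketch of what "cluster-specific" could mean in practice. It is not souporcell's own method. It assumes that cluster_genotypes.vcf is a tab-separated VCF with one sample column per cluster and a GT entry in the FORMAT column, and it treats a variant as cluster-specific when one cluster's GT is not shared by any other cluster. The function names `normalize_gt` and `cluster_specific_variants` are made up for illustration, and the GO:GN likelihoods are ignored.

```python
def normalize_gt(gt):
    """Return an unphased, sorted genotype string (1/0 -> 0/1), or None if missing."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return "/".join(sorted(alleles))


def cluster_specific_variants(vcf_lines):
    """Yield (chrom, pos, ref, alt, cluster, gt) for sites where exactly that
    cluster carries a genotype no other cluster has.

    Assumption: one sample column per cluster, GT present in FORMAT.
    """
    clusters = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            clusters = fields[9:]  # sample (cluster) names from the header
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        gts = [normalize_gt(col.split(":")[gt_idx]) for col in fields[9:]]
        if None in gts:
            continue  # skip sites where any cluster has a missing call
        for i, gt in enumerate(gts):
            others = gts[:i] + gts[i + 1:]
            if others and gt not in others:
                name = clusters[i] if i < len(clusters) else str(i)
                yield (fields[0], int(fields[1]), fields[3], fields[4], name, gt)
```

Calling `cluster_specific_variants(open("cluster_genotypes.vcf"))` would then list candidate sites per cluster, which is the input that point 2 would compare against genotype calls from the WXS data.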

Zethson commented 3 years ago

I would also be interested in a couple of pointers for this.

qindan2008 commented 2 years ago

I would also be interested in how to extract cluster-specific variants.