AlexsLemonade / alsf-scpca

Management and analysis tools for ALSF Single-cell Pediatric Cancer Atlas data.
BSD 3-Clause "New" or "Revised" License

Create genetic demultiplexing workflow #153

Closed jashapiro closed 2 years ago

jashapiro commented 2 years ago

With #152, the initial exploration of genetic demultiplexing (#127) could be considered mostly done, so I am creating a new issue for next steps. All of the components for genetic demultiplexing are there, but they are in separate workflows, with incomplete connections among them.

The task now is to join them together so that a single workflow can take the run data table as input and produce as output a set of files with all of the information needed for demultiplexing. Whether that output would be a single SCE object with sample assignment as part of colData, or separate SCE objects for each sample, is not yet decided.

The steps are these:

  1. for a given multiplexed library, identify the bulk samples that correspond to the multiplexed sample set
  2. map relevant bulk samples with STAR
  3. jointly call SNPs with mpileup from the bulk mapping to create a VCF
  4. map single cell/nucleus samples with STARsolo
  5. use bulk VCF and mapped single cell results to call single cell SNPs & demultiplex
  6. (possibly) combine demultiplex info with previous quantification (including cellhash data)
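
As a rough illustration of step 1 only, here is a minimal Python sketch of how bulk runs might be matched to a multiplexed library from a run table. Everything in it is an assumption made for illustration: the column names (`run_id`, `library_id`, `sample_id`, `technology`), the `;`-separated sample list, the `bulk` technology label, and the helper names are hypothetical and are not taken from the actual run data table or workflow.

```python
from collections import defaultdict

# Hypothetical run table rows; the column names and values are illustrative only.
RUNS = [
    {"run_id": "R1", "library_id": "L1", "sample_id": "S1;S2", "technology": "10Xv3"},
    {"run_id": "R2", "library_id": "L2", "sample_id": "S1", "technology": "bulk"},
    {"run_id": "R3", "library_id": "L3", "sample_id": "S2", "technology": "bulk"},
    {"run_id": "R4", "library_id": "L4", "sample_id": "S3", "technology": "bulk"},
    {"run_id": "R5", "library_id": "L5", "sample_id": "S3", "technology": "10Xv3"},
]


def split_samples(run):
    """Return the sample IDs for a run (assumes a ';'-separated field)."""
    return [s.strip() for s in run["sample_id"].split(";") if s.strip()]


def is_bulk(run):
    """Assumes bulk runs are labelled with a technology starting with 'bulk'."""
    return run["technology"].lower().startswith("bulk")


def plan_demux(runs):
    """Map each multiplexed library to its samples and the matching bulk runs."""
    # Index bulk runs by the sample they were sequenced from.
    bulk_by_sample = defaultdict(list)
    for run in runs:
        if is_bulk(run):
            for sample in split_samples(run):
                bulk_by_sample[sample].append(run["run_id"])

    plan = {}
    for run in runs:
        samples = split_samples(run)
        # Only single-cell/nucleus libraries with more than one sample are multiplexed.
        if is_bulk(run) or len(samples) < 2:
            continue
        missing = [s for s in samples if s not in bulk_by_sample]
        if missing:
            raise ValueError(
                f"library {run['library_id']} has no bulk run for: {missing}"
            )
        entry = plan.setdefault(
            run["library_id"],
            {
                "samples": samples,
                "bulk_runs": sorted(r for s in samples for r in bulk_by_sample[s]),
                "sc_runs": [],
            },
        )
        entry["sc_runs"].append(run["run_id"])
    return plan


if __name__ == "__main__":
    for library, info in plan_demux(RUNS).items():
        print(library, info)
```

In a sketch like this, the `bulk_runs` for each library would feed steps 2 and 3 (bulk mapping and joint SNP calling), and the `sc_runs` would feed steps 4 and 5. Failing early when a sample has no bulk data is one possible design choice; skipping the library with a warning would be another.
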

jashapiro commented 2 years ago

closed by #160