AlexsLemonade / alsf-scpca

Management and analysis tools for ALSF Single-cell Pediatric Cancer Atlas data.
BSD 3-Clause "New" or "Revised" License

CITE-seq workflow #120

Closed jashapiro closed 3 years ago

jashapiro commented 3 years ago

This PR adds a workflow to perform CITE-seq mapping and quantification with salmon alevin/alevin-fry. This workflow does not yet combine these data with the scRNAseq results; that will come in a separate PR, likely when incorporating these results into scpca-nf. For now, the goal is partly to have a set of output files that we can use to develop the steps required for creating SingleCellExperiment objects with both modalities of data.

This workflow is based largely on the existing alevin-fry workflow, with some added elements and other changes.

I think we will be able to just combine RNA and CITE-seq by sample_id (using something similar to the .combine step here), but I will need to check that this always works. If that is not the case, we may need to add another column to the library info table that indicates which run_id contains the corresponding RNA for each CITE-seq or cell hash sample.
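To make the concern about pairing by sample_id concrete, here is a minimal Python sketch of the check, not the Nextflow code in this PR. The function name, the record layout, and the example run and sample IDs are all hypothetical; only the sample_id and run_id column names come from the discussion above. A CITE-seq run is paired only when exactly one RNA run shares its sample_id, and anything else is reported as unresolved, which is the case where an explicit column in the library info table would be needed.

```python
from collections import defaultdict


def pair_by_sample_id(rna_runs, cite_runs):
    """Pair each CITE-seq run with the single RNA run sharing its sample_id.

    Both arguments are lists of dicts with "run_id" and "sample_id" keys
    (a hypothetical stand-in for rows of the library info table).
    Returns (pairs, unresolved): pairs is a list of
    (rna_run_id, cite_run_id) tuples; unresolved lists CITE-seq run_ids
    with zero or multiple candidate RNA runs.
    """
    rna_by_sample = defaultdict(list)
    for run in rna_runs:
        rna_by_sample[run["sample_id"]].append(run)

    pairs, unresolved = [], []
    for cite in cite_runs:
        matches = rna_by_sample.get(cite["sample_id"], [])
        if len(matches) == 1:
            pairs.append((matches[0]["run_id"], cite["run_id"]))
        else:
            # No RNA run, or more than one: sample_id alone is not enough.
            unresolved.append(cite["run_id"])
    return pairs, unresolved


# Hypothetical example data, for illustration only.
rna_runs = [
    {"run_id": "R1", "sample_id": "S1"},
    {"run_id": "R2", "sample_id": "S2"},
    {"run_id": "R3", "sample_id": "S2"},
]
cite_runs = [
    {"run_id": "C1", "sample_id": "S1"},
    {"run_id": "C2", "sample_id": "S2"},
    {"run_id": "C3", "sample_id": "S3"},
]

pairs, unresolved = pair_by_sample_id(rna_runs, cite_runs)
print(pairs)
print(unresolved)
```

If the unresolved list is ever non-empty for real data, that would suggest sample_id is not a sufficient join key on its own.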

jashapiro commented 3 years ago

I had thought about it, but that step is so very fast (less than 5 seconds) that it hardly seems worth it. I have code in there so we are only doing it for the number of indexes that exist, so it isn't like there are a bunch of extra copies hanging around.

If we did publish it, we would presumably want to have some code to look for the published version, which might actually make things more complex!