AlexsLemonade / alsf-scpca

Management and analysis tools for ALSF Single-cell Pediatric Cancer Atlas data.
BSD 3-Clause "New" or "Revised" License

Modify alevin-fry workflow to run spatial transcriptomics libraries using unfiltered mode #158

Closed allyhawkins closed 2 years ago

allyhawkins commented 2 years ago

As discussed in https://github.com/AlexsLemonade/alsf-scpca/pull/151#pullrequestreview-825323348, we should try running alevin-fry in unfiltered mode rather than knee mode for ST libraries and see whether that affects some of the results we are seeing in benchmarking.

For some more context, this was originally posted by @jashapiro in https://github.com/AlexsLemonade/alsf-scpca/pull/151#pullrequestreview-825323348:

I just had a thought about how we might see if that is a likely problem. If the AF knee method is keeping some spot barcodes that do not correspond to real spots, we should see those in the set of barcodes that are removed when joining with spot coordinates. If there are barcodes that are being removed that have "surprisingly high" UMI counts, that might indicate that this is part of the difference. I don't think this needs to be part of this PR, but I think we might want to have a look before deciding that AF should not be used. (While it seems we do have to run spaceranger no matter what, having AF results for comparison to the single cell data seems like it still might be worthwhile).
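To make the check described above concrete, here is a minimal Python sketch. It assumes we already have a mapping of barcode to total UMI count from the alevin-fry output and the set of barcodes that have spot coordinates. The function name, the toy data, and the choice of "at least the median UMI count of kept barcodes" as the cutoff for "surprisingly high" are all assumptions made for illustration, not something decided in the discussion.

```python
from statistics import median


def surprising_removed_barcodes(umi_counts, spot_barcodes):
    """Flag barcodes dropped by the coordinate join that still have high UMI counts.

    umi_counts: dict mapping barcode -> total UMI count (e.g. from alevin-fry)
    spot_barcodes: set of barcodes that have spot coordinates
    Returns a list of (barcode, count) pairs sorted by count, highest first.
    """
    kept = {bc: n for bc, n in umi_counts.items() if bc in spot_barcodes}
    removed = {bc: n for bc, n in umi_counts.items() if bc not in spot_barcodes}
    if not kept:
        # Nothing to compare against, so report every removed barcode
        return sorted(removed.items(), key=lambda item: item[1], reverse=True)
    # Assumed cutoff: the median UMI count among barcodes that were kept
    threshold = median(kept.values())
    flagged = [(bc, n) for bc, n in removed.items() if n >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Toy data, made up purely for illustration
toy_counts = {"AAAC": 5200, "AAAG": 4100, "AAAT": 3900, "CCCA": 4500, "CCCG": 12}
toy_spots = {"AAAC", "AAAG", "AAAT"}
flagged = surprising_removed_barcodes(toy_counts, toy_spots)
```

If many barcodes are flagged this way on the real libraries, that would suggest knee filtering is keeping barcodes that are not real spots.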

The first step should be modifying the current workflow so that alevin-fry's unfiltered mode works with ST libraries. As part of this, we will need to grab the Visium barcode files from spaceranger-1.3.1/lib/python/cellranger/barcodes/ and add them to the barcodes folder on S3, s3://nextflow-ccdl-data/reference/10X/barcodes. The barcode files then need to be added to the list of barcode files in the workflow itself.

Then we need to change the technology in the metadata to reflect the version of Visium that was used. According to the spaceranger output for SCPCR000372 and SCPCR000373 (the samples being used for benchmarking), the version is visium_v1. Changing the technology is important to make sure we are using the correct barcode file.
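As a rough illustration of why the technology value matters, the lookup from technology to barcode file could look something like the Python sketch below. The actual workflow is written in Nextflow, so this only shows the logic. The technology keys other than visium_v1 and all of the file names are hypothetical placeholders; the real names should be taken from the spaceranger barcodes directory and the existing workflow.

```python
BARCODE_DIR = "s3://nextflow-ccdl-data/reference/10X/barcodes"

# Hypothetical technology -> barcode file names, for illustration only.
# The real entries should match the files copied from spaceranger.
BARCODE_FILES = {
    "10Xv2": "737K-august-2016.txt",
    "10Xv3": "3M-february-2018.txt",
    "visium_v1": "visium-v1.txt",
}


def barcode_path(technology):
    """Return the full barcode file path for a library's technology."""
    try:
        filename = BARCODE_FILES[technology]
    except KeyError:
        # Fail loudly rather than silently falling back to the wrong barcode list
        raise ValueError(
            f"No barcode file registered for technology {technology!r}"
        ) from None
    return f"{BARCODE_DIR}/{filename}"


visium_path = barcode_path("visium_v1")
```

With a lookup like this, a library whose metadata does not carry a registered technology fails immediately instead of running against the wrong barcode list.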