COMBINE-lab / alevin-fry

🐟 🔬🦀 alevin-fry is an efficient and flexible tool for processing single-cell sequencing data, currently focused on single-cell transcriptomics and feature barcoding.
https://alevin-fry.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Only getting DIR:false after collate of SPLIT-seq #111

Closed: davidaknowles closed this issue 1 year ago

davidaknowles commented 1 year ago

Hi - I'm trying to run alevin-fry on the 150k mouse SPLIT-seq data from the original SPLIT-seq paper, roughly following the tutorial here: https://combine-lab.github.io/alevin-fry-tutorials/2022/split-seq/ (but of course with a mouse-only index). My current issue is that after I run generate-permit-list and collate, I get a collated RAD file that contains only DIR:false (i.e. antisense) alignments. Before collation I see a mix, with about 2x as many DIR:true (i.e. sense) alignments as DIR:false. Is this expected?

rob-p commented 1 year ago

Hi @davidaknowles,

Thanks for filing the issue / raising this question. This is an effect of how the collate command works that, admittedly, is not well documented (we'll aim to fix that).

Basically, during collate, alevin-fry also filters out mappings that are not consistent with the required orientation specification. Because every read in the filtered and collated RAD file is assumed to have a "consistent" orientation, that information is no longer actively tracked, and the RAD file simply marks the orientation of each mapping as false (hence the DIR:false you are seeing). This behavior is somewhat of a historical artifact, but it is how the collate command currently works, since orientation no longer participates in quantification after this point. If there is a good reason to retain the orientation information, however, we can work on supporting that and propagating the orientation to the collated RAD file.

Best, Rob
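
For readers who land on this thread, the behavior described above can be illustrated with a small toy model. This is only a sketch and not alevin-fry's actual implementation: the `Mapping` record, the `toy_collate` function, and the `"fw"` / `"rc"` / `"both"` labels are all hypothetical names chosen for illustration, and the barcode grouping step is included as an assumption about what collation does beyond what is said in this thread.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Mapping:
    """Hypothetical stand-in for one mapping record in a RAD file."""
    barcode: str
    ref_id: int
    is_forward: bool  # plays the role of the DIR field


def toy_collate(mappings: Iterable[Mapping], expected_ori: str = "fw") -> List[Mapping]:
    """Toy model: drop mappings with the wrong orientation, then reset the flag.

    expected_ori is a hypothetical label: "fw", "rc", or "both".
    """
    def consistent(m: Mapping) -> bool:
        if expected_ori == "both":
            return True
        return m.is_forward == (expected_ori == "fw")

    # Keep only orientation-consistent mappings. The flag is then written as
    # False for every survivor, because it is no longer tracked.
    kept = [replace(m, is_forward=False) for m in mappings if consistent(m)]

    # Assumption: collation also groups records by barcode (stable sort here).
    kept.sort(key=lambda m: m.barcode)
    return kept


raw = [
    Mapping("AAC", 1, True),
    Mapping("GGT", 2, False),  # inconsistent with "fw", so it is filtered out
    Mapping("AAC", 3, True),
    Mapping("GGT", 4, True),
]
collated = toy_collate(raw, expected_ori="fw")
for m in collated:
    print(m)
```

In this model the input has a mix of orientations, but every record in the output reports `is_forward=False`. That value says nothing about the strand of the surviving mappings; it only reflects that the field was reset after filtering.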

davidaknowles commented 1 year ago

Thanks for the rapid response! I think I can safely ignore orientation then.