Open wir963 opened 4 years ago
Thank you for your suggestion @wir963. This would certainly be a useful feature. If you are waiting for this capability, I think you could probably write a small pipeline to modify the current output:
PathSeqScoreSpark
PathSeq currently removes all read tags - I have received requests in the past to fix this. I'm not sure when I'll have a chance to address this, but I will keep the ticket open since it's currently the only feature request.
Thanks for your suggestion @mwalker174. I'd definitely like to do this soon so I'll implement that suggestion and update this thread with any questions.
Hey @mwalker174,
Do you have any suggestions about how to perform step 1? I naively tried to use Picard's MergeBamAlignment, with the PathSeq output BAM as the aligned BAM and the PathSeq input BAM as the unmapped BAM, but I get the following error message:
IllegalArgumentException: Do not use this function to merge dictionaries with different sequences in them. Sequences must be in the same order as well. Found [NZ_DS990135.1, NZ_AJSY01000035.1, ...
I tried sorting both BAM files by queryname and removing the alignments from the input BAM using RevertSam, but neither of these worked. I suspect the problem is the PathSeq output BAM, given that the error refers to the microbial sequences. Do you have any suggestions?
FYI, I'm now just using pysam to do the merge manually, and it's working. I'll keep you posted.
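For readers attempting the same manual merge, here is a minimal sketch of one way it could be done with pysam. It is not the code used in this thread. The function names are hypothetical, and the default tag names `CB` and `UB` assume 10x Cell Ranger conventions for cell barcode and UMI. It keeps the tags of every input read in memory, so it may need adapting for large BAMs.

```python
def index_tags(reads, wanted=("CB", "UB")):
    """Map read name -> {tag: value} for the tags we want to carry over.

    `wanted` defaults to CB/UB, which is an assumption (10x-style tags).
    """
    tags_by_name = {}
    for read in reads:
        found = {t: read.get_tag(t) for t in wanted if read.has_tag(t)}
        if found:
            tags_by_name.setdefault(read.query_name, {}).update(found)
    return tags_by_name


def copy_tags(reads, tags_by_name):
    """Yield each read after restoring any tags indexed under its name."""
    for read in reads:
        for tag, value in tags_by_name.get(read.query_name, {}).items():
            read.set_tag(tag, value)
        yield read


def merge_tags(input_bam, pathseq_bam, out_bam, wanted=("CB", "UB")):
    """Copy `wanted` tags from the PathSeq input BAM onto the output BAM."""
    import pysam  # imported here so the helpers above work without pysam

    # check_sq=False lets pysam open BAMs that have no @SQ header lines.
    with pysam.AlignmentFile(input_bam, "rb", check_sq=False) as src:
        tags_by_name = index_tags(src.fetch(until_eof=True), wanted)

    with pysam.AlignmentFile(pathseq_bam, "rb", check_sq=False) as aligned, \
            pysam.AlignmentFile(out_bam, "wb", template=aligned) as out:
        for read in copy_tags(aligned.fetch(until_eof=True), tags_by_name):
            out.write(read)
```

Matching on read name sidesteps the sequence-dictionary comparison that MergeBamAlignment performs, since the header of the PathSeq output BAM is reused unchanged as the template for the merged file.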
Feature request
Tool(s) or class(es) involved
PathSeq
Description
I would like the PathSeq scoring approach to be able to consider UMIs, such as those used in scRNA sequencing experiments like 10x. UMIs are generally passed as a BAM tag, so reads that share the same UMI should only be counted once.
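To illustrate the requested behaviour, here is a minimal sketch of UMI-aware counting. It is not PathSeq's scoring algorithm. The function name and the `(taxon, cell_barcode, umi)` record layout are hypothetical, and the sketch ignores reads that map ambiguously to several taxa as well as sequencing errors in the UMI itself.

```python
from collections import defaultdict


def umi_collapsed_counts(records):
    """Count each (cell barcode, UMI) pair at most once per taxon.

    `records` is an iterable of (taxon, cell_barcode, umi) tuples; this
    layout is an assumption made for illustration only.
    """
    seen = defaultdict(set)
    for taxon, cell_barcode, umi in records:
        seen[taxon].add((cell_barcode, umi))
    return {taxon: len(pairs) for taxon, pairs in seen.items()}


records = [
    ("taxon_A", "cell1", "UMI1"),
    ("taxon_A", "cell1", "UMI1"),  # same UMI in the same cell: counted once
    ("taxon_A", "cell2", "UMI1"),  # same UMI, different cell: counted separately
    ("taxon_B", "cell1", "UMI2"),
]
print(umi_collapsed_counts(records))  # {'taxon_A': 2, 'taxon_B': 1}
```

Keying on the cell barcode together with the UMI matters for 10x-style data, because the same UMI sequence can occur independently in different cells.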