stjude / XenoCP

A cloud-based tool for mouse read cleansing in xenograft samples
Apache License 2.0

Bug: unmapped read pairs are not preserved in output #9

Closed mcrusch closed 3 years ago

mcrusch commented 5 years ago

Unmapped read pairs (those with an empty reference name) are not being included in the final BAM. Presumably this was introduced with the FASTQ extraction optimization, since reads with no reference are not extracted there.

adthrasher commented 5 years ago

My suggestion would be to use sambamba to pull these reads during the splitting step, e.g. sambamba view -F "ref_name =~ /\*/ and unmapped and mate_is_unmapped". Then pass the resulting BAM as an additional input to the merge step at the end.
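For illustration, the condition that filter expresses can be sketched in plain Python over SAM text records. This is only a sketch of the predicate, not XenoCP's implementation and not how sambamba works internally. The function names `is_unplaced_pair` and `split_records` are hypothetical, and the input is assumed to be tab-delimited SAM lines (for example, from a BAM converted to SAM text).

```python
# SAM FLAG bits (from the SAM specification).
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def is_unplaced_pair(sam_line: str) -> bool:
    """Return True if the record has no reference name ("*") and both
    the read and its mate are flagged as unmapped.

    Hypothetical helper: mirrors the filter
    'ref_name =~ /\\*/ and unmapped and mate_is_unmapped'.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: FLAG
    rname = fields[2]       # column 3: RNAME
    return (
        rname == "*"
        and bool(flag & READ_UNMAPPED)
        and bool(flag & MATE_UNMAPPED)
    )


def split_records(lines):
    """Partition SAM lines into (header, unplaced_pairs, everything_else).

    Header lines start with '@'; SAM read names cannot start with '@',
    so this check is safe for alignment records.
    """
    header, unplaced, other = [], [], []
    for line in lines:
        if line.startswith("@"):
            header.append(line)
        elif is_unplaced_pair(line):
            unplaced.append(line)
        else:
            other.append(line)
    return header, unplaced, other
```

The `unplaced` records are the ones that would be written to the additional BAM and handed to the merge step. Reads that are unmapped but placed at a mapped mate's position carry that mate's reference name, so they fall into `other` and follow the normal path.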

adthrasher commented 3 years ago

Closing now that #10 is merged.