GregoryFaust / samblaster

samblaster: a tool to mark duplicates and extract discordant and split reads from sam files.
MIT License

unmated reads #50

Closed: ghost closed this issue 3 years ago

ghost commented 3 years ago

Greetings,

I see that samblaster has an --ignoreUnmated flag. It seems to require that the BAM file be sorted by query name. However, it seems like it should also work with a BAM file produced by

bwa mem ... | samtools view -b > blah.bam

From a quick look at the source code, it seems like using the --ignoreUnmated flag on blah.bam should work, since alignments with the same query name are adjacent in that file.

What do you think? It would be nice to skip the samtools sort -n blah.bam step.

GregoryFaust commented 3 years ago

If you read the README carefully, it says that the input file must be read-id grouped, not necessarily read-id sorted. It also shows that the typical usage scenario for samblaster is indeed in a pipe right after the aligner, as you suggest. This holds for both paired reads and singletons. No sort is needed when samblaster is used in such a pipe.
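To make the distinction between "read-id grouped" and "read-id sorted" concrete, here is a minimal Python sketch that checks whether SAM text lines are grouped, meaning every query name occurs in a single contiguous run. The function name `is_read_id_grouped` and the sample records are hypothetical and for illustration only. This is not how samblaster itself validates its input.

```python
def is_read_id_grouped(sam_lines):
    """Return True if every QNAME appears in one contiguous run of records.

    Illustrative helper only (not part of samblaster). Header lines, which
    start with '@', and blank lines are skipped.
    """
    seen = set()      # query names whose run has already started
    previous = None   # query name of the most recent alignment record
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue
        qname = line.split("\t", 1)[0]  # QNAME is the first SAM column
        if qname != previous:
            if qname in seen:
                # This name already had a run that ended earlier.
                return False
            seen.add(qname)
            previous = qname
    return True


# Grouped but NOT sorted: readB precedes readA, yet mates are adjacent.
grouped = [
    "@HD\tVN:1.6\tSO:unsorted",
    "readB\t99\tchr1\t100",
    "readB\t147\tchr1\t300",
    "readA\t99\tchr1\t500",
    "readA\t147\tchr1\t700",
]

# Not grouped: the two readA records are separated by readB.
interleaved = [
    "readA\t99\tchr1\t100",
    "readB\t99\tchr1\t150",
    "readA\t147\tchr1\t300",
]

print(is_read_id_grouped(grouped))
print(is_read_id_grouped(interleaved))
```

The same function could be fed the text output of `samtools view -h blah.bam` read line by line from standard input. Name-sorted input always passes this check as well, because sorting by name is just a stricter form of grouping.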