GregoryFaust / samblaster

samblaster: a tool to mark duplicates and extract discordant and split reads from sam files.
MIT License

Question about "A read-id grouped SAM file" #27

Closed jiadong324 closed 7 years ago

jiadong324 commented 8 years ago

Hi all,

I have a SAM file that is sorted by coordinate. To use samblaster, I sorted it by read ID (QNAME) with the following samtools command:

samtools sort -n sample.bam sample.QNAME.sorted.bam

After running the command, the SO field in the @HD header line is queryname, but I still have a problem when running samblaster. The error is shown below:

samblaster: Can't find first and/or second of pair in sam block of length 1 for id:
samblaster: At location: chr20:50456848
samblaster: Are you sure the input is sorted by read ids?

I would really appreciate any suggestions, and I look forward to discussing this.

Thanks.

GregoryFaust commented 8 years ago

Your sort command looks fine, but samblaster takes SAM as input, not BAM. Which version of samblaster are you using? Version 0.1.23 will detect when it is given a BAM file instead of a SAM file, so if you are on that version, you probably do have unpaired reads in your input. You can check by finding the offending read and seeing whether its mate is also present. Alternatively, you can have samblaster ignore these errors with the --ignoreUnmated option. If you do the latter, I recommend looking carefully at the statistics that samblaster prints at the end of a run to see how many unpaired reads were found.

If you are trying to use a BAM file as input, that won't work, and you need to do something like this:

samtools view -h sample.QNAME.sorted.bam | samblaster -i stdin ....

Greg
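For readers who want to check for missing mates as suggested above, here is a minimal Python sketch. It is not samblaster's own logic. It assumes SAM text that is already grouped by read ID, and it treats a group as unmated when no record carries the "first in pair" flag (0x40) or no record carries the "second in pair" flag (0x80). The function name find_unmated and the sample records are hypothetical.

```python
from itertools import groupby

def find_unmated(sam_lines):
    """Yield (qname, record_count) for read-id groups missing the first or second read of a pair.

    Assumes the lines are grouped by QNAME (e.g. queryname-sorted input).
    """
    # Skip blank lines and header lines, which start with '@'.
    records = (
        line.rstrip("\n").split("\t")
        for line in sam_lines
        if line.strip() and not line.startswith("@")
    )
    for qname, group in groupby(records, key=lambda fields: fields[0]):
        has_first = has_second = False
        count = 0
        for fields in group:
            count += 1
            flag = int(fields[1])      # FLAG is the second SAM column
            if flag & 0x40:            # first read of the pair
                has_first = True
            if flag & 0x80:            # second read of the pair
                has_second = True
        if not (has_first and has_second):
            yield qname, count

# Small in-memory example with made-up records: r1 is a complete pair, r2 has only one end.
example = [
    "@HD\tVN:1.0\tSO:queryname\n",
    "r1\t99\tchr20\t100\t60\t5M\t=\t200\t105\tACGTA\tIIIII\n",
    "r1\t147\tchr20\t200\t60\t5M\t=\t100\t-105\tACGTA\tIIIII\n",
    "r2\t73\tchr20\t300\t60\t5M\t=\t300\t0\tACGTA\tIIIII\n",
]
for qname, count in find_unmated(example):
    print(qname, count)
```

To run it on real data, the same function can be given sys.stdin and fed from samtools view -h, as in the pipeline shown above.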

jiadong324 commented 8 years ago

@GregoryFaust

Thanks for the help. I did find unmated pairs in the SAM file, and I simply removed those records from it. Your suggestion works on my original file.

Thanks again!
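The thread does not say how the unmated records were removed. One possible way, sketched here in Python under stated assumptions, is to pass through header lines and keep only those read-id groups that contain both a "first in pair" (0x40) and a "second in pair" (0x80) record. The function name drop_unmated and the sample records are hypothetical, and the input is assumed to be SAM text grouped by read ID. The --ignoreUnmated option mentioned earlier in the thread is the alternative that leaves the file untouched.

```python
from itertools import groupby

def drop_unmated(sam_lines):
    """Yield header lines plus every read-id group that contains both ends of a pair.

    Assumes the alignment lines are grouped by QNAME (e.g. queryname-sorted input).
    """
    non_blank = (line for line in sam_lines if line.strip())
    # Split the stream into header runs ('@' lines) and alignment runs.
    for is_header, block in groupby(non_blank, key=lambda l: l.startswith("@")):
        if is_header:
            yield from block
            continue
        # Within the alignment run, handle one read id at a time.
        for _, group in groupby(block, key=lambda l: l.split("\t", 1)[0]):
            group = list(group)
            flags = [int(line.split("\t")[1]) for line in group]
            has_first = any(f & 0x40 for f in flags)
            has_second = any(f & 0x80 for f in flags)
            if has_first and has_second:
                yield from group

# Small in-memory example with made-up records: r2 has only one end and is dropped.
example = [
    "@HD\tVN:1.0\tSO:queryname\n",
    "r1\t99\tchr20\t100\t60\t5M\t=\t200\t105\tACGTA\tIIIII\n",
    "r1\t147\tchr20\t200\t60\t5M\t=\t100\t-105\tACGTA\tIIIII\n",
    "r2\t73\tchr20\t300\t60\t5M\t=\t300\t0\tACGTA\tIIIII\n",
]
for line in drop_unmated(example):
    print(line, end="")
```

Filtering this way discards data, so it is worth counting how many groups are dropped before relying on the result.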