NGMLR is a long-read mapper designed to align PacBio or Oxford Nanopore (standard and ultra-long) reads to a reference genome, with a focus on reads that span structural variations.
Hi, I ran ngmlr with the following command: ngmlr -t 10 -r $ref -q $reads --rg-id Pse2Mcin --rg-sm Pse -o out.sam. After it finished, I used samtools v1.9 to convert the SAM file to BAM and sort it: samtools view -@ 10 -bS out.sam > out.bam && samtools sort -@ 10 -o out.sorted.bam -T tmp.ali out.bam. When I used out.sorted.bam as the input to PBSV (an SV caller), the log file showed: BAM header: read group ID not found. So I checked the original out.sam generated by ngmlr and found something odd. First, I checked whether out.sam contains an @RG header line, and it does: @RG ID:Pse2Mcin SM:Pse. So that part looks right. Second, I checked whether every alignment record carries the RG:Z:Pse2Mcin tag. To my surprise, some records contain RG:Z:Pse2Mcin but others do not. I selected two alignments as examples below.
Although I don't know what causes this, I went ahead and ran PBSV to call structural variations with the --sample option, which overrides the sample name tag from the BAM read group. It ran without any error, but I don't know whether the result is correct.
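For anyone who wants to reproduce the check described above, here is a minimal sketch in Python of how the alignment records with and without the RG:Z tag could be counted. The function name count_rg_tags and the demo records are made up for illustration; they are not real output from ngmlr. It assumes a plain-text, tab-separated SAM file where optional tags start at the 12th column.

```python
def count_rg_tags(sam_lines, rg_id):
    """Count alignment records with and without the RG:Z:<rg_id> tag.

    sam_lines: any iterable of SAM text lines (e.g. an open file handle).
    Returns a (tagged, untagged) tuple. Header lines (starting with '@')
    and empty lines are skipped.
    """
    wanted = f"RG:Z:{rg_id}"
    tagged = untagged = 0
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue
        # The first 11 columns are mandatory; optional tags follow.
        optional_fields = line.split("\t")[11:]
        if wanted in optional_fields:
            tagged += 1
        else:
            untagged += 1
    return tagged, untagged


# Hypothetical demo data (NOT real ngmlr output): one record carries
# the read group tag, the other does not.
DEMO_SAM = "\n".join([
    "@HD\tVN:1.0",
    "@RG\tID:Pse2Mcin\tSM:Pse",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tRG:Z:Pse2Mcin",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
])

if __name__ == "__main__":
    tagged, untagged = count_rg_tags(DEMO_SAM.splitlines(), "Pse2Mcin")
    print(f"tagged={tagged} untagged={untagged}")
```

On a real file, the same function could be called as count_rg_tags(open("out.sam"), "Pse2Mcin"); a non-zero untagged count would match the behaviour described above.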
I'm looking forward to your reply! Thank you!