lh3 / bwa

Burrow-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
GNU General Public License v3.0

paired end read name problems #290

Open dpellow opened 4 years ago

dpellow commented 4 years ago

With the latest version of bwa mem (0.7.17), it seems that paired-end reads split across two files must have names like read1/1 and read1/2 (using the "/" character). An older version I used (0.7.5) also accepted names like read1_1 and read1_2, or read1.1 and read1.2.

What is the most recent version that still supports these other naming conventions? Is there a way to keep bwa mem from erroring out on reads named this way?
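One possible workaround, independent of the bwa version, is to rewrite the read names so that both mates carry an identical name before alignment. The sketch below is only an illustration: `normalize_read_name` and `normalize_fastq` are hypothetical helpers, not part of bwa, and they assume that every read name in the file really ends in a mate marker (`/1`, `_1`, `.1`, or the `2` equivalents). On files where the name legitimately ends in `.1` or `.2` for other reasons (for example SRA-style names such as `SRR19880797.1`), this would strip too much, so check the naming convention first.

```python
import re

# Hypothetical helper, not part of bwa: matches a trailing mate marker
# such as "/1", "_2" or ".1" at the very end of the read name.
_MATE_SUFFIX = re.compile(r"[/_.][12]$")


def normalize_read_name(header: str) -> str:
    """Strip a trailing mate marker from the read name of a FASTQ header.

    Only the first whitespace-delimited token (the read name) is changed;
    any comment after the first space is kept as-is.
    """
    name, sep, comment = header.partition(" ")
    name = _MATE_SUFFIX.sub("", name)
    return name + sep + comment


def normalize_fastq(lines):
    """Yield FASTQ lines with every header line (lines 0, 4, 8, ...) normalized.

    Assumes strict 4-line FASTQ records (no wrapped sequences).
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield normalize_read_name(line.rstrip("\n")) + "\n"
        else:
            yield line
```

Applied to both input files, this turns `@read1_1` and `@read1_2` into the same name `@read1`, so a name comparison between mates no longer depends on the separator character.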

jingydz commented 1 year ago

I'm running into this error too.

less SRR19880797_sort.1.fastp.fastq.gz

(reference)
@SRR19880797.5023185 5023185/1
CTGTGGCCCTGTGCCAAACCTGGAGCAGCTGCCTTTAGAGGCCAGGAGGGCTACTTCCCGTTTCCTGAGCACTGTCCCTCTGTCTGCAGGAGTGCTGCTG
+
FF@FFFFFFFFGFFFFFFFGAFFEFFFFFFGFFFFGCFFGFEFGGF>GGFFFFGGFFGFFFGGBGGFGGGFFGGFFGFFFBGFGFFFFFDFFFGFC
(error)
@SRR19880797.5023186 5023186/1
CTGGGAAGAAGCACAGACCACCAGGCCCCCTGTTCTCCTCCTCAGATCCCCTTCCTGCCACCTCTTCCCATTCCCAGGACTCAGCCCAGGTCACCTCGCT
+
GGFFFFFFFFFFGFFGEFFGGFFFGGFFFFFGGG>FFGFGEGGEEFGFGGFGGFFGGGFEFGFFEFFGFGFGFGGFFFFFFD>EFFGGDFFGG@GFA;FF
(reference)
@SRR19880797.5023187 5023187/1
AGGACACGGTACAAAAGGGCAGCCAGGCAGGGTTGGAAGGTGGGGTCTGAGGGGTTTCCACCTGCCCTCTCCCATCCTTCCAGGTTTTGGCGGCAGATGG
+
F?FFFFGFF/FGFGFF>FFFFFFFFFFFFFFFFFFFFEFFFFEFFD@FFCFFEFFFFGGFFFFDFDFFFGFFFFFFFFFEFFGFBFGFFECFF:DFBFFF

less SRR19880797_sort.2.fastp.fastq.gz

(reference)
@SRR19880797.5023185 5023185/2
GGCTGGCCCAGCGCCAGCGTCGGAGCGCCGGCCCCCTCCCCGGGCCGCCCCCACCCAACCAGACCCTCCAGCGCGTGCCACCGGACCTCGTGTCCTAGAC
+
);7@CDB1B:AA=DCB3AE;D?C:5=61?46?B@BE;;+A9@19+97&&>A'@4;)9&8?>&8E3>76='(BB,>&<&2EC'4;?=9.4>+5
(error)
@SRR19880797.5023186 5023186/2
TCCTTGAACACAGCAGGGTTGGAGGCCATGAGGCTCTGGGCCTCCGTGAAGCTGAGCTGCACAGGGTAGTAGCCGCCATTGAACGGGTTGTGGCAGGATG
+
FFFFFGDFFGFFEFFFFFFEFF@FFFFFEFFFFFFFFFFFFFGFDGFGF;FFGGEGFFGEFFFFFFFF>FFFFFFEFFFFGFFDFFG@FF<FFFDF=@FG
(reference)
@SRR19880797.5023187 5023187/2
AAATTCCACAAGAGGGTCATTAAGTGTGATAGTGGAAATGCCCTAACCTCCACCCTTACTTCTCAAATATTCTAGCTATTGGAGATAAAGTACCATATAC
+
GFFFFFGFF?FGFFFFFFFGFGFGFGFFFFFGFGFFFFFGFFFGFFFFFF>GFFFFFFFFFFFGGFFFGFFFFCEFFGGFGFFFFFFFFFFFFGFGGFFF

Checking SRR19880797:

[mem_sam_pe] paired reads have different names: "SRR19880797.5358018", "SRR19880797.10839728"
[E::sam_parse1] CIGAR and query sequence are of different length
[W::sam_read1] Parse error at line 9982028
[main_samview] truncated file.
Mapping failed

samtools view -h ./SRR19880797/SRR19880797.bam |less +9982028 -SN

(reference) line 3370:
SRR19880797.1 65 chr8 143932417 60 100M chr22 20819568 0 TGGCGGTCATGTTGGTGTTGCGGTCGCTCCAGTCGAAGCCCACCTCCTCCTCCTCCTTCTCATTCAGCCACATTAGCTCCTTAGTGGCGGTTGCCACAAA FFFGFCFFGFFFGFFFFFFFFFFFGGFFGFFFEFFDFGFFFFFFGFFFFFGFGGFGGFF@GGGFGFFFFFBFFGAFFGFFFGGDAGFGDG+@D9@/?=59 NM:i:1 MD:Z:90C9 AS:i:95 XS:i:23 RG:Z:SRR19880797

(reference) line 3371:
SRR19880797.1 129 chr22 20819568 60 100M chr8 143932417 0 AGAGGGATTTTCTTCGCAGGGGAGCTTAACAGGGTCTTTCTCCTCTGCTCTTTCCCCAGTAGCCCAGGCCCACCTGAGAGATGCTGGACACACTGCTGGT GFDFFFFFF;FFF9FFFFGFFFBFFFFGFFFFFFFFFEFFFFFFFFFFFFFFFFFFFFFFEFFFFFFFFF>FFFGFFFEFFFFFFFF@FFDFFFFEECG: NM:i:0 MD:Z:100 AS:i:100 XS:i:20 RG:Z:SRR19880797

(line before the error) line 9982027:
SRR19880797.5023186 81 chr22 22643043 0 100M chr3 126500710 0 AGCGAGGTGACCTGGGCTGAGTCCTGGGAATGGGAAGAGGTGGCAGGAAGGGGATCTGAGGAGGAGAACAGGGGGCCTGGTGGTCTGTGCTTCTTCCCAG FF;AFG@GGFFDGGFFE>DFFFFFFGGFGFGFGFFEFFGFEFGGGFFGGFGGFGFEEGGEGFGFF>GGGFFFFFGGFFFGGFFEGFFGFFFFFFFFFFGG NM:i:0 MD:Z:100 AS:i:100 XS:i:100 RG:Z:SRR19880797

(error line) line 9982028:
SRR19880797.5023186 161 chr3 126500710 60 100M chr22 22643043 0 TCCTTGAACACAGCAGGGTTGGAGGCCATGAGGCTCTGGGCCTCCGTGAAGCTGAGCTGCACAGGGTAGTAGCCGCCATTGAACGGGTTGTGGCAGGATG FFFFFGDFFGFFEFFFFFFEFF@FFFFFEFFFFFFFFFFFFFGFDGFGF;FFGGEGFFGEFFFFFFFF>FFFFFFEFFFFGFFDFFG@FF<FFFDF=@FG NM:i:0 MD:Z:100 AS:i:100 XS:i:0 RG:Z:SRR19880797
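The `[mem_sam_pe] paired reads have different names` message above names two unrelated reads (`SRR19880797.5358018` and `SRR19880797.10839728`), which may mean the two FASTQ files are out of sync at some record rather than merely using a different suffix style. That is only a guess from the log. A minimal sketch for locating the first record where the two files disagree is shown below; `read_names`, `first_mismatch`, and `check_files` are hypothetical helper names, and the code assumes strict 4-line FASTQ records and ignores a trailing `/1` or `/2`.

```python
import gzip
from itertools import zip_longest


def read_names(lines):
    """Yield the read name of each 4-line FASTQ record.

    The name is the first whitespace-delimited token of the header,
    without the leading '@' and without a trailing /1 or /2.
    """
    for i, line in enumerate(lines):
        if i % 4 != 0 or not line.strip():
            continue
        name = line.split()[0]
        if name.startswith("@"):
            name = name[1:]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name


def first_mismatch(lines1, lines2):
    """Return (record_number, name1, name2) for the first differing pair.

    Record numbers start at 1. A name of None means that file ended early.
    Returns None when every record pairs up.
    """
    pairs = zip_longest(read_names(lines1), read_names(lines2))
    for idx, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return idx, a, b
    return None


def check_files(path1, path2):
    """Compare two gzip-compressed FASTQ files record by record."""
    with gzip.open(path1, "rt") as f1, gzip.open(path2, "rt") as f2:
        return first_mismatch(f1, f2)
```

For the files in this comment, the call would be `check_files("SRR19880797_sort.1.fastp.fastq.gz", "SRR19880797_sort.2.fastp.fastq.gz")`. A non-`None` result gives the record at which the mates stop lining up, which is where to start looking.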