jts / sga

de novo sequence assembler using string graphs
http://genome.cshlp.org/content/22/3/549

Read and quality lengths are not the same after EC #107

Open starnight0 opened 8 years ago

starnight0 commented 8 years ago

It seems you forgot to delete the quality value when a base is deleted during error correction, which makes the read length and the quality length different. For example:

After error correction:
ATGAGCCCATGTGCAGTCAACGCCAATGACGCCATTGCATCGAGCGCAAAAGAATCGGTGAACTGTTTTCTTCAACTGGTTTATTGAATCGAACTGTCAGAAAGAACTAACGTTACTGGTCATCCGAAAACCCATGCAACCGGTTCCTGACTCGTGAACGAGTCATTATCTGGCTCGGCTCGGTGTTCATGTCTCTCTTCACAGCAGTTCAGTCAGCACGCCAGTCTCGCTTGATTCGCTCTATGATGT
+
DDDDDIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIHHIIIIIIIIIIIIIIIIIIHIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIHGIIIIIHEHIIIHIIIIIIIHHIIIIIIEHHHIIIHIIHIHIIIHIIIIGHIHFGHIIIIIHIIIIIIIIIIIIHIIGGHIDHHHHHH6G@=BFH.BCCBHHCGHHHI@A?@8G.FE.@H.6.-C,-,8F8@---@?@HE@HHHCEHHHHHHHHH

[chenz11@khan default]$ sed -ne"/^@CRATOS:376:HC2NCBCXX:1:1101:3388:2482\/1/=" preprocess.fq
12441

Before error correction:
@CRATOS:376:HC2NCBCXX:1:1101:3388:2482/1
ATGAGCCCATGTGCAGTCAACGCCAATGACGCCATTGCATCGAGCGCAAAAGAATCGGTGAACTGTTTTCTTCAACTGGTTTATTGAATCGAACTGTCAGAAAGAACTAACGTTACTGGTCATCCGAGAACCGATGCAACCGGTTCTTGACTCGTGAACGAGTCATTATCTGGCTCGGCTCGGTGTTCATCTCTCTCTTCACAGCAGTTCAGTCAGCACGCCCAGTCTCGCTTGATTCGCTCAATGATGT
+
DDDDDIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIHHIIIIIIIIIIIIIIIIIIHIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIHGIIIIIHEHIIIHIIIIIIIHHIIIIIIEHHHIIIHIIHIHIIIHIIIIGHIHFGHIIIIIHIIIIIIIIIIIIHIIGGHIDHHHHHH6G@=BFH.BCCBHHCGHHHI@A?@8G.FE.@H.6.-C,-,8F8@---@?@HE@HHHCEHHHHHHHHH
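To find other affected reads, something like the following might help. It is only a rough sketch, not part of SGA: it assumes a plain four-line-per-record FASTQ file (no wrapped sequence lines), and the function name `find_length_mismatches` is made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def find_length_mismatches(lines: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (read_id, seq_len, qual_len) for records whose lengths differ.

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\n")
        next(it, "")  # '+' separator line, ignored
        qual = next(it, "").rstrip("\n")
        if len(seq) != len(qual):
            # Drop the leading '@' from the header to get the read id.
            yield header[1:], len(seq), len(qual)


if __name__ == "__main__":
    import sys

    # Usage: python check_fastq.py <error-corrected FASTQ file>
    with open(sys.argv[1]) as fh:
        for read_id, seq_len, qual_len in find_length_mismatches(fh):
            print(f"{read_id}\tseq={seq_len}\tqual={qual_len}")
```

Running it on the error-corrected output should list every read where a base was removed but its quality value was kept.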

Zelin