rvaser / spoa

SIMD partial order alignment tool/library
MIT License

Convex gap penalties #16

Closed: ksahlin closed this issue 5 years ago

ksahlin commented 5 years ago

Hi again,

I was wondering if it would be possible to explore convex gap penalties within spoa. To give a concrete example: penalize a gap as log(gap_size), so that extending a gap becomes cheap and longer insertions are favored over shorter ones. The figure attached below illustrates this.

While this might be nontrivial to implement, it's definitely needed by the part of the community that expects structural gaps in their alignments (transcripts or structural variants). I believe work in this direction will be immensely helpful and publishable. If you don't have time to look into this, I would appreciate your opinion on whether this is possible to implement within spoa.

This idea has been investigated in regular aligners such as NGMLR, but would be, to my knowledge, novel in a POA alignment strategy.

[Figure] (Image taken from https://medium.com/pacbio/visualizing-the-chaos-of-cancer-one-tool-at-a-time-a9e083f8bc31)
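The log(gap_size) idea above can be illustrated with a small sketch. It compares a logarithmic gap cost with an ordinary affine one. The function names and the parameter values (gap_open, scale, gap_extend) are assumptions chosen only for illustration; they are not taken from spoa or from the figure.

```python
import math


def log_gap_cost(k, gap_open=4.0, scale=2.0):
    """Hypothetical cost of a gap of length k that grows as log(k)."""
    if k <= 0:
        return 0.0
    return gap_open + scale * math.log(k)


def affine_gap_cost(k, gap_open=4.0, gap_extend=2.0):
    """Ordinary affine cost: every extra gap base costs the same amount."""
    if k <= 0:
        return 0.0
    return gap_open + gap_extend * k


if __name__ == "__main__":
    for k in (1, 5, 25, 50):
        print(k, round(log_gap_cost(k), 2), affine_gap_cost(k))
```

Under the logarithmic cost, each additional gap base is cheaper than the previous one, so one gap of 50 bases scores much better than two gaps of 25 bases. Under the affine cost the two alternatives differ only by one extra gap-open penalty.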

rvaser commented 5 years ago

I'll look into it.

rvaser commented 5 years ago

I thought about it, and the easiest way to implement convex gaps is the way minimap2 does it, i.e. with two affine functions. Will that suffice?
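As a rough sketch of the "two affine functions" idea: a gap of length k is scored with whichever of two affine penalties is cheaper, one with a low open and high extension cost (short gaps) and one with a high open and low extension cost (long gaps). The function name and the parameter values below are illustrative assumptions, not spoa's actual interface or defaults.

```python
def two_piece_gap_cost(k, open1=4, extend1=2, open2=24, extend2=1):
    """Cost of a gap of length k as the minimum of two affine functions.

    All four parameters are illustrative placeholders.
    """
    if k <= 0:
        return 0
    short_gap_cost = open1 + k * extend1  # cheaper for short gaps
    long_gap_cost = open2 + k * extend2   # cheaper for long gaps
    return min(short_gap_cost, long_gap_cost)


if __name__ == "__main__":
    for k in (1, 10, 20, 50):
        print(k, two_piece_gap_cost(k))
```

With these values the two lines cross at k = 20: shorter gaps are scored by the first function and longer gaps by the second, which gives a piecewise-linear approximation of a convex (sublinear) gap penalty.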

ksahlin commented 5 years ago

That would be great! And yes, I think that would suffice.

I did some pairwise alignment testing with minimap2 vs. regular affine alignment. My experiments indicate that minimap2 handles long gaps better with this two-step approach. Therefore, I'm hopeful that it will work just as well in a spoa setting!

rvaser commented 5 years ago

@ksahlin do you know which alignment parameters were used in the above figure? Or do you have some examples of your own that would help me check convex gaps?

ksahlin commented 5 years ago

Unfortunately I don't know the parameters for that plot, or even the exact convex function used.

Here is a reference of 6 fake exons, each 50 bp long.

>ref
GGGGTCAGATGCCCTGTAATGAGCCACAGAAACTTGGGCCCATGGGTAGGTTCCAGGAGAGAGGGGCCTGGAGGGGTCCTCAGCCCTGGGGGATTGGGGTGTCAAGCAACTTCTCTCTCCAGGCTCAGTCCTGCGGTCTGTGGGGAGACCTTCCTGTGGGCGCAGCTGGAGTCAAGGCTTGGGGTCTTGGGGTATGCTTCGCAGACAAAGCAGCTGTGCCAGTCTCCGAGTTCCTGGGACTCTGCCAGATCCAGGGCATCCTGAGCGGGCCCGGCTGGGGTGGGGATGGGGTCCGAGGGC

Below are 100 simulated reads with an error rate of about 16%, mostly deletions. Exons 2 and 5 are deleted in about 90% of the reads; the remaining 4 exons occur in all reads.

Only one read has all 6 exons (read 19, the first one below), and about 10 reads each have either exon 2 or exon 5. The rest of the reads (about 80) have only the four exons 1, 3, 4 and 6. The quality values are incorrect and should be ignored here.
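For anyone who wants to load the reads below, here is a minimal sketch of a FASTQ reader that discards the quality lines, since they carry no information here. It assumes plain 4-line records as in this post; the function name read_fastq_sequences is hypothetical.

```python
import io


def read_fastq_sequences(handle):
    """Yield (name, sequence) pairs from 4-line FASTQ records, ignoring qualities."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of input
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        if not header.startswith("@"):
            raise ValueError("expected a FASTQ header line, got: " + header)
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line, ignored on purpose
        yield header[1:], sequence


if __name__ == "__main__":
    example = io.StringIO("@r1\nACGT\n+\n++++\n@r2\nGG\n+\n++\n")
    for name, seq in read_fastq_sequences(example):
        print(name, len(seq))
```

Printing the read lengths is a quick sanity check: reads that lack one or both of the 50 bp exons should be visibly shorter than the read that has all six.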

@19_71.68061635887801
GGGTCGATCCCCTGAATGAGCCACAGAAGACTTGGGCCATGGGAGGTCCAGGAGTAGGGGGGCCTGGGGGCGTCCTCAGCCCTGGGGATGGTGGTGTCAAGCAACTTCTCTCCTCCAGGCTCAGTCCTGCGGTTCTGGTGGGGGAGACTTCCTGTGGCGCAGCTGAGCTCAAAGGCTTGGGGTCTGGGGTATGCTTGCAGAAAACACTCAGTGCCAGTCTCCGCAGTCTGGGACTCCTCCAGATCAGGGCATCCTGAGCGGGCCCGGCTGGTGGTGGGGAGGGGTCCCCGAGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@61_61.51315304556198
GGGGGTCAGATGCCTGTAATGAGCCACAGGAAAAAATTGGGCCCATGGGTAGGCGTCAAGCAACGTTCTCTCTTCCAGGCTCAGCTGCTGGTCTGTGGGGAGACCTTCCTGGGCCGCAGCCTGAGTCAAGGCTTAGGGGGTCTTGGGGTATGCTTCGAGACAAAGAGCTGTGCACAGTTCTCTGAAGTTCTTGGGACTTGCCAGATCCAGGGCTCCTGAGCGGGCCGGCTGGAGTGGGGAGTTGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@60_58.71710063440007
GGGGTCAGATGACCCTGTAATGAGCCAAGAACTGGGCCCATAGGTGAGGGTCAAGAGACTTCTCCTCCAGGCTCATGTCCTGGGTCTGTGGGGAGACCTTCCTGGGTGCGCAGCTGGAGTACAAGGCTTGGGAGTCTTGGGGTATGTTCCGCAGACAAAGCAGCTCCAGTCTCCGAGTTCCTGGCTCTGCCAGATCGGGATCCGTGAGCGGACCCGGCTGGGGTGGGGAGGGGTCCAGGTGAC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@93_57.954540885901366
GGTCAGATGCCCTGTAAGAGCCACAGAAACTTGGGCCCAATGGTAGTGTTCCAGGAGAGAGAGGGCCTGGAGGGTCCTCGCCCTGGGGATTGGGCGTGTCAAGAACTTCCTTCAGGCTCAGTCCTCGGACTGTGGGGAGACCTTCCTGTGGGCGCAGCAGTACAAGGGCTGGGGTCTTGGGGTATGCTCCCAGGCATCCCTGAGCCGGGCCCGGCTTGGGGGTGGGATGGGGCTCCGAGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@29_57.446167720235565
GGGGTCAGATGCCCTGTATAGCACAGGAAACTTGGTCACCATGTAGGTCAACGAACCTTCTCTCCAGGCTCAGGTCCATGCGGTACTGTGGGGAACCTTCTTGTGGGCTGCCAGCCTGGAGTCAAGGCTTGGGGTCTGGGGATGCTTCGCCAAGCAGCGTGAGTCTCGAGTTCTGGGACGTTGCCAGATCAGGGCATCCTGAGCGCGGCCCCGCTGGGGTGGGGATGGGGTCCGAGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@48_57.446167720235565
GGGGTCAGATGCTGTAATGACCACAAACTGGCCCAGGGTAGGGTCAAGCCAACTTCTCCGTCCAGCTCAGTCCTGCGGTTTGGGGAGGACCTCCTGTGGCGCGCTAGGAGTCAAGGTTAGGGGTCTTGGGGTATGCTTCGCAGACAAAGCACGTGCTCAGTTCCCGAGTTCCTGGGACTCGTGCCAGATTCCAGGGCATCCCTGAGGCCCGGCTGGGGTGGAGGATGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@58_56.93779455456976
GGGGGTCAGATGCCCTGTAATGAGCCACAGAAACTTGGGCCCATTGGGAGGTCAACAGACTTCTCTCTCCAGGCTGTCCTGCGTCTGTGGGAGACCTTCCTGGGGCCCGCAGCTGAGTCAAGGTTGGTTTGGGGTTGCCTTCGCAGACAAGAGCAGCTTCCCAAGTCTCGGTTCCTGGGACTCTGCAATCCAGGGCATCCTGAAGCGGGCCGGCGGCGTGGGATGGGGTCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@30_56.68360797173687
GTGGGTCAGTGCCCTGAATGTAGCCAACAGAAACTGGGCCCATGGGTAGGTTCAGCGAGAGGGCCTGGAGGGTCCTACCCTGGGGGATTGGGGTGCTCAGGCATACTTCTCTCTCCAGGCTCAGTTCCTGGGATCTCGTGAGGGAGACCTATCTGTGGCGAGCGGGTCACAGCGCTTGGGGTCATTGTATGCTTCCAGGCATCCTGAGGGGCCGGTGGTGGGTGGGGTCCAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@70_56.68360797173687
GGGAGAGCTGTATAGCTCACAGAAACTCTGGCCATGGGTAGGTTTCCAGGAGAGAGGGGCCTGGAGGGGTCCTCGCCTAGGGGGATTGGGTGTCAAGCAACTTCTCTCTCCGCTCATCCTGCGGTCATGTGGCGGAGACCTCTGTGTGGCGCAGCTGGAGTCAAGCTTGGGGTCTGGGGTATGTTCCGCAGGGCATCCCTGAGCGGGCCCGGCTGGGTGGGATGGGGTCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@18_56.42942138890397
GGGGTAGATGCCGTAATGAGCACAGAAACTTGGCCCATGGGGTAGGTCAAGCACTCTCTCCCGGCTCAGTCCTGCGGTCGTGGGAGCCTTCCTGTGGGACGCGCTGGATCAAGGCGTTGGTCGTTGGGGTATGCTTCGCATGACATAGACAGCTGGCCAGTCTCCGAGTCCTGGGACTCGTCGCCAGATCCAGGCATCCTGCGGGCCCGGCTGGGGGGGATGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@69_56.42942138890397
GGGGCAGATGCCTGTTAATGAGCCACAGAAACTTGGCCCATGGTGGTCAGAGAGAGGGGCCTGGATGGGGTTCCCAGCCTGGGGGATTCGGGTGTCAAGCACTTCTCTTTCCAGGGCTCAGTCTGCGTTGTGGGCCTCCTGTGGGCGCAGGCTGGAGTCAAGGCTTGGGGTCCTTGGTATCGCTCCGCAGGGGCATCCTGAGCGGCCCGGCGGGTGGAGAGGGGTCCGAGGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@1_56.17523480607104
GGAGGCGATGCCCTGAATGAGCCACAGACCTTGGCGCCATGGGTAGGTTTCCAGGAGAGAGGGCCTGAGGGGATCCAGCCTGGTGGGATTGTGCAGCACTTCTTTCCAGGTCTCAGTCCTGCGGTCTGTTGGGGAAGACCTTCCTGAGTTGGGCGCGTCGGAGTCAAGGCTGGGTCTTGGCGATGCTTCCCAGGCATCCTGAGCGGGCCGGCTGGGGTGGGATGGGGTCAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@21_56.17523480607104
GGGGGTCGATGCCCCTGTATGAGCCACAGAACTTGGCCATGGGTAGGGTTCCGGAGAGAGCGGGCCGAGGGTCCTCAGCCCTGGGGATTGGGTTCAAGCCCAACTTCTCTTCCAGGCCTCAGTCCTGGTCTGTGGGGACTTCCTGTGGGGCACGGGTCAAGGGCTTGGGCGTCTGGGGTATGCTTCCCAGGCTCCATGAGCAGCCCGGCTGGGGTGGGATGGGGTTCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@91_56.17523480607104
GGTCAGATGCCCTTGTATGACCACAGAAACTTGGGCCCAGGGGTAGGGTCAAGCACTTCTCTCTCTCAGCTCGTCCTGCGTCTGTGGGGAGACCTTCTGTGGCGCAGCGGAGTCAAGGCTTGGGGTCTTGGGGTATGCTTCGCAGACAAAGAGCTGTCCATGCTCCGAGTTCCTGGGGTCGCCAGATCCAGGGCATCTAGAGCGGCCCGCTGGGTGGGGATGGGGTCGGGGCC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@36_55.92104822323817
GGTCAGATGCCCTGGTAAGACGCCACAGAAAACTTGGCCTCGGGTAGGTTCAGGAGAGGGGCCTTGGGGGGTCTCAGCCCTGGGGTTGGGGTGTCAAGCAACTCTCCATGGCATCAGTCCTGGGTCTGTGCGGGGACCTTCCTGTGGCGCAGCGGAGTCAGGCTTGGGTCTTGGGGTTATGTCATTCCCACGGGCATCTGAGGGGCCCGGCTGGGGGGGGAGGGTCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@50_55.92104822323817
GGGGTCGATGCCCTGAATGACCACAGAAACTGGCCCAGGGTAGGTCAGGAGAGAGGGGCCTGGAGGGGCTCAGCCCTGGGGGAATTGGGGTTCAGAACTTCTGCTCTCCAAGGCTCAGTCCTGCGGCTGTGGGGAGACTTCTGTGGGCGGGACTGGAGTCAGGCTTGGGGTCTTGGGGTATGCTCCCAGGGCATCCTGAGCGGGCGCCGGCTGGGCTGGGATGGGTCCGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@52_55.666861640405266
GGGGTCAGATGCCTTATGAGCCAGAAACTGGGCCCTGGGTAGGTCCAGGAGGAGGGGCCTGGAGGGGCTCAGGCCTGGGGGATGGGGTGTGCAAAACCTTCTTCTCCGGCTCAGTCCTGCGTCTGTGGGGACCTTCCTGTGGGTCGCAGCTGGAGTCAAAGCTTGGGGGTCCAGGGGTATGCTCCCAGGGCATCCCTGAACGCGCCCGGCTGGGGGGGATGGGTCCAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@33_54.90430189190656
GGGTGTCAGATGCCCATGAATGACACAGAAACTTGGGCCCATGTGGTAGGTCAGGAGAGAGGGGCCTGTAGGGTCCTAGCCCGCGGAGGTTGGGGTGTCAAGCAACTTCTCTCTCAGGCTCGTCTGCGGCTGGGGGAGACTTCTGTGGCGCACTGGGTCAAGGGCTTAGGGTCTTGGGTATGCCCCAGGGCATCCTGAGCGGGCCCGGCGGGGATGGGGTCCTGAGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@42_54.39592872624076
GGGGTTCAGATGCCCTGTAGAGGCCACAAGAAACTTGGGCCATGGGTAGGTCCGAGAGGGGCATGGAGGTCTCGCCCTGGGGGTTGGTGTCAAGCACTTCTCTCTCCAGGCTCAGTCCTCGGTCTGTGGGGAGACCTTCCCTGTGGCCAGTGGGTCGCTTGGGGCTTGGGTATGCTTCCCAGGGGCAATCCTGAGCGGCCGCGGTGGGGTGAGGAGGGTCCGAGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@73_53.88755556057496
GGGTCAAGATGCCCTGTATGCCAAGAAACTTGCCTGAGTAGGGTCAAGCAACTTCTCTCCCAGGCTCATCCTGCGCTTGTGGGGAACCTTCCTTTGTGGGCGGCTGAGTCAAGGCTTGGGGTCTGGGTATGCCTCAGACAAAGCGCTGTGCCAGCTCCGATCCTGGGACTCTGCAACCAGGGCATCCTGAGCGGGCCCCGCTGGGGGGGATGGGTCCGAGGTGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@41_53.37918239490916
GGGGTAGAAGCCTGAATGAGCCCAGAACTTGGGCCCTGGGAGGTCAACATACTTCTTCCCAGTCAGTCCCGAGTCTGCTGGGGGACTTCTGTGGCAGCGCGGAGTCGGCTGGGTCTGGGGTATGCTTCCAACAAAGACAGGCTGTGCAGTCTCGAGGTCCTGGGACCTGCAGATCCAGGGCATCTGGCGGCCGGCTGGGGTGGTGATCGGGGTGCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@5_48.54963732108404
GGGGGTCAGATGCCTGTAATGAGCCACATGAAACTTAGGGCCCATCGTGGTAGGGTCAGACAACTCTCTCTCCAGGCTCTCAGTCCTGCGTGTCTGTGGGGAGACCTGTCCTGTGGGCGCAGCATGATTCAAGGGCTTGGGTCTTGGGGTATCTCCAGGGACATCTGAGGGGCCCGGACTGGGGTGCGGGTGGGGTCCGAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@74_48.04126415541822
GGGTCAGAATGCCCGTAGATGAGCCCACAGAACTTGGGCCATGGGTAGGGTAACAAGCAACTTCTCTCCTCAGGCTCCTAGCCTGCTCTGTGGGCGGACCTCCTGTTGGGCGCAGCTTGGAGTCAAGGCTTGGGGTCATTGGGGTATGCTTCCTAGGCATCCCGAGCGGGCCCGGCTGGGGTGGGCGATGGGTCCCGAGGG
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@90_47.78707757258534
GGGGTAGATGCCCGTGTGAATGAGCACAGAAACTTGGGCCCATTGGGCTAGGGTACTAGCAAACTTCTCTCTCGCAGCTCTAGTCCGCGGTCGTGTGGGGCAGACACTTCCTGGGTGCGCAGCTGGAGTCAGTTGGGTTTGGGGTATCGCTCCAGGGCATCTGGGCGGGCCGGCTGGGGGGGGGGATGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@9_47.53289098975244
GGGGTTTCAGTGCCCTGTAATTGAGCCACAGAACTTGGGCCATGGGTAGGGCTCAAGAACTTCTGCCTCCAGGCTCAGGTCCTGGGTCGTGGAATGGGAGACCTTCCTGTGGGCGCAGCTGGAGTCAAAGGTTTGGGGTTTGGGGTATGCTTCCCAGGGGCACCTGAGCGGGCCCGGCGGGGTGGGGATGGGGTCAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@26_47.27870440691954
GGGGTAGATGCTCTTAATGAGCCACAGAACCTATTGGGCCATGGGTGGGTCAAGCAAACTTCTTCCATGCTCAGTCCTGCGTCTGTGGGAGACCTTCCTGTGGGGCGCATTGCTGGAGTCAAGGTTGGGGGTCCTTGGGGTACTGCTTCCCAGGGCGATCCTGAGGAGGGCCCGGCTGGGGTGGGATGGTCCGAGGTG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@43_47.27870440691954
GGGTCAGATTGCTCCTGTATTAGCCACAGAAATGGGCCCATGGGTAGGGGTCAAGCAACTTCTCTTCTCCACGGCTCAGTCCTGCGGTTGTGGGGGACCTTCCTTGTGGCGCGCTGGACGTCAAGGCTTGGGGTCTTGGGGTATGCTTCCAGGGATCCTGTAGCGGGCCCGGCTGAGGGGGCGGTGGGCCGAGTGCTG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@81_47.27870440691954
GGGGCTCAGATGCCTGTAATGAGCCAATGAATCTCGGCCCAATGGGTAGGTCAAGCATATTCTCTCTCCAGGCCAGTCCGCGGTCTGTGGGGAGCCTTCCTTGGGCGCAGCGTGAGTTCAAGGCTTGGGGTCTTGGGTGTAATGCTTCCCAGGGCATCCTGTAGCGGGCTCCGCTGGGCGTGGTGGATGGGGTCCGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@15_47.024517824086615
GGGGTCATGCCTGTAATGTAGCCAACAGACAACTTGGGCCAATGCGGTAGGGTAGCAACGTTCTCTCTCCAGGGCCAGTCTGCGCTGGTGGGGAAGACGCTTCCTGTGGGCGCAGCTGGAGTCAAAGGCCTTGGGGTCTGGGGGTATGCTTCCCAGGGCATCTCTGACGGCCGCTGGGTGGGGATGGGGTCCAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@31_47.024517824086615
GGGTCAGGATCCCTGTAATGAGCCACAGAAACTTGGCGCCAGTAGGTCAAGCTAACTATCATCTCTCAGGCTCACTGGCGGTCTGTGGGGAGACCTTCCCTGTGGGGCGAGCTGGGTCAAGGCTTGGGGTTCTTGGGGGTATGCTTCCCAGGGCATCCATGAGCGGGCCCGAGCTGGGGTGGGATGGCTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@39_47.024517824086615
GGGGTCAGATTGCCGCTGAATGAGCCACAGAACTTGGGCCCTGGTAGGTCAAGCAACTCTCTCTCAGGCCGTCGAGTCCTGCGGTTGTGCGGGCGACTTCCGGGGCCGCCAGCTGGAGTCAAGGCTTGGTGGTCTTGGGGTATGCTTCCCAGGGCTCCTTGAGCGGCCCGGCTGGGGTGGGGGATGGGGTCCGGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@40_47.024517824086615
GGGGTCAGATGCCTGTAATGAGCCACAGAAACCTTAGGGCCATGGGTAGGTCAGCATTCTCTCTCCAGGCTAGTCCTGCGGTCTAGGTGGGTAGAACCTCGCGTGGAGGCGCAGCTTGGGTCAGGCTTGGGGTCTTGGGGTATGCTTCCCAGGCGCACCTGAGCGGGCCGGTCGGGGTGGGGATGGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@63_47.024517824086615
GGTTCAGATGGCCTGTAATGAGCCAGCGAGAAACTTGGGGCCCATGGGACGGGTCAACAACTTCTTCTCCGGCATCAGTCTGCGGTTGTGGGGAGACCTTCCTGTGGGCGCAGCTGGAGTCAAGGCTTGGGGTCTGGGGTATGCTTTCCAGGGCATCCTGAGCGGGCCCGCTGGGGTGGGGTATGGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@56_46.770331241253736
GGGTGCAGAGCCCATGAGCCAGAAACTTGCGCCAGGGTAAGGGTCAAGGCAATTCTTCTCCCAGGCTCAGTCCTGCGTCTGTGGGGAGACCTTCCTGGTGGGCGACGCTTGGAGGTCAAGGGCTTGGGGCTTTGTGGGTATGCCCCAGGGACATCCGAGCGGGCCCCTGGCTGGGGTGGGGATTGGGGTCCAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@83_46.770331241253736
GGGTGATGCCCTGTAATGAGCCCAAGAAATTGGCCCATGGGTAGGGTCCAAGCATCTCTCTCTCCAAGGCTCACGTCCTGGCGGTCTGGGGCGAACCTTCCTGTGGGCGCAGCTGGAGCTCAAGGGCTGGGGTCTTGGGTATGCTCCCAGGGCATTCCTGAGCGGCCCGGCTGGGGTGTGGATGGAGGTCGAGGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@11_46.516144658420835
GGGGTAGATGCCCTGTAATGACCAAGAAACTTGGGCATGGGTAGGGTCAAGCAACGTTCTCTCTCCCAGGCTCAGGTCCGCGGTCTGTGGAGACCTTCCTGTGTGCGCAAGCTGGAGCCTAAGCTTGGGGTCTTGGGTATGCTTCCCAGAGGCATCCTGAGCGGGCCGGCGGGCGTGGGGATGGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@62_46.516144658420835
GGGGTCAGCTGCCCTGTAATGAGCCACAGAAACTTGTGGCCCATGTGGTAGGGTCAAGCCAACTTCTCTCTCCAGGCTCAGTCCGGTTCTGTGGGGAGACCTTCCTTGGGCGCAGGTGGAGTCAAGGCTGGGGCTTGGGGGTATGACTTCTCCAGGCATCCTGACCGGTGCCCGGCTGGTGGGGTGGTCCGGAGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@82_46.516144658420835
GGGTGTCAGTGCCCTGTAATGAGCACAGAAACTTGGGCCCTGGTAGGGTCAACAACTTCTCTCTCAGGCTCCAGTCCTGCGGTCTGTGGGGAGACCTCCTTCGGGCCAGCTGGAGTCGTAAGGCTGGGGGTCGTTGGGGTTGTTCCTCAGGGACATCCTGCGCGGGCCCGGCCTGGGGTGGATGGGGTCCGGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@2_46.261958075587934
GGGTCAGAAAGCCCTGTAGAGCCACAAAAACTTGGGCCTGGGTAGGGTCAAGGCAACTTCCTCTCTTCCAGGCTCAGTCCTGCGGCTGTGGGGGACCTTGCCTGTGGGCGCAGCTGGAGGTCAAGGCTTGGGTTTGGGTATGCTTCCCAGGGCATCCTGAGCGGGCCCGGCTGGGGTGGGGATGGGGTCGGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@6_46.261958075587934
GGGGTCAGATGGCCCTGGTATGAGCCACAGAACACTTGGGCCCCGATGGGTAGGTCAAGCACTTCCCCTCCCGGCTCAGCCTCGGGTCTGTGGGAGACCTTCCTGTGGGCGCAGCTGGTCAAGGCCTGGTTTGGGGTATGACTATCGCCAGGGCACCTGAGCGGGCCCGGCTGGGTGGTGGGGATCCGGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@17_46.261958075587934
GGGGCTCAGATGCGCCTGTATTGGAGCCACAGATAACTTGGGCCCATTGGTAGGTCAAGCAACGTCTCCCGGCTCATCCTCGTGTCCTGTGGGGTACGACCGTTCCGTGGGCCAGCTGGTCAAGGCTTGGGTCTTGGGGTATGCTGTCCCAGGGCATCCTGGCGGGCCGGCTGGGTAGGGGATGGGGCCAGAGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@47_46.261958075587934
GGGTCAGTGCCTGTATGACAAGAGAAACTGGGCCCATGGGTAGGGTCAAGCAACTTCTCTCTCCAGGCTCAGCCTGGCGAGTCTGTGGGGAGACCTTCCTGTGGCGCGCTGGAGCGAAGGTCGTTGGGGTCTTGGGGTAATGCTTCCAGGCATCCTGAGCAGGGCCCGGGCGGGGTGGGATGGGTCCGAGGGAC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@76_46.261958075587934
GGGTAGATGCCCTGTAAAGTTGAGCACTAGAACTGGGCCATGGGTAGGGTCAAGCAACTTCTCCTCCCAGCGTCGTCCTCGCGGTCTGTGGGAGACCTTCCTGGCTGTGGCAGCTGGTCATAGGCTGGGGTCTTGTGGAGTTATGCTTCCCAGGGCACCTGAGCGGGCTCCGGCTGGGGGGGATGGGGCGAGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@22_46.00777149275501
GGGGTCAGATGCCTGTGAATGGCCAAGAAACTTGGGCCATAGGGTGGTCAAGCAACTTTCTCTCTCCAGGCGTCAGTCTGCCGGTCTTGGGAGACTTCCTGTGGGCGAGCTGGAGTCAAGCGCGTTGGGTCTTGGGTCATCTTCCAGGGCATTGAGCGGGGCCCGGGCGGGGTGGGGAATCGGGGCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@25_46.00777149275501
GGGGTCGATGCCCTGTAATGAGCGCAAGAAACTTGGGCCCATGGGTAGGGTAAGACGTTCTGCTGTGCCAGGCTCAGTCCTGTCGGTCTGTGGGGAGCCTCCTGTGGGCGCAGCCTGGAGTCAGGCTTGAGGGTTGGGGTATGCTCCCAGGCACCTTGAGCGGGCGCCGGGCTGGGTGGGGAGGTCCGAAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@34_46.00777149275501
GGGGTAGATGCCCTGTAATGAGCCACAGAAACTGGGCCCATGGGTAGGGTCAGAGCACTCTCTCTCCAGTGCTCGTCTGCGTCTGGGGGAGCCTCCTCGGGGCTGCAGCTGAGTCAAGGCTTGGGGTCTTGGGCGTATGCTTCCAGGCATCCTGACGCGGCCCGGCTGGGGTGGGGATGGGGTCCGATGGAGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@94_46.00777149275501
GGGGTCAGATGCCTTAATGAGCCACAGAAACTTGGTGCCCATGGGTAGCGGTCACAGCACTTCTTCTCAGGCTCAGTCCTGCGGTCTGTGGGGAGACCTTCCTGTGGGCGCAGCTGAGTCAAGGCTTGGGGCTTGGGTATGCTTCCAGGGCATCCTGAGCGGCTCCGGCGGGTGGGATAGAGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@3_45.75358490992213
GGGGTCAGATGCACCTTGAGAGCACAGAATACTTGGGCCCATGGGTAGGTCAAGTCAACTTTCTCTCTCAGGCTCTCCTGCGGTCTGGGAGGAGACCTTCCTCGTGGCCAGCAGTCAAGCTTTGGGGTCTTGGGGATGCTTCCCAGGCTCCTGAGCGGGCCCGCGCGCGGGTGGGGATGGGGCCGAAGGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@57_45.75358490992213
GGGTCAGATGCCCTAATGGCCACAGAAAGCTGGCGCCTGGGTAGGGTCAGCAACGCTTCTTCCTCTCAGGCTCAGCCTGCGGCTGTGGGAGACCTTCCTGTGGCGCAGACTGGGTCAGGCTTGGGGCTGTGGGGTATGCTTCCCAGGGCATCCGAGCCGGCCCGGACTGGGGTTGGGGTGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@68_45.75358490992213
GGGCAGATGCCCGTCAATGAGCCACAGAACATTTGGGGCCCAGGTAGGTCAAGCACTTCTTCATTCCCAGCTCGCCTGCGTCTGTGGGGAGACCTTCCTGTGGGCGCAGCTTGGATCGAAGGCTCTGGGCTTGGGGTATGCTTCCCAGGGCATCCTGCACCGGCCGGCTGGGGTGGGGATGGGGTCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@38_45.49939832708923
GGGGCAGATGCCTGTAATGAGCAAGAAACTTGGGCCCTATGGGTATGGGTCAAGCAACTTCTCTCTCCAGCTCAGTCCGCGATCGTGGGGAGACCTTCCTTGGGCGCACTGGAGTCAATGGGCTGGGTCTTTGGGTATGCTTCCCAGCATCTGAGCGGGGCCCGGCTGGGGTGGGATGGCGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@54_45.49939832708923
GGGGTCAGATCTGTAAATGAGCGCGACAAAACTTGGGCCATGGGTAGGGTCAAGCTACTTTCTCCTCAGGCTCATCCTGGGTCGTGGGAGAGAGCCTTCCGTGTGGGCGCAGCTGGAGTAAGGCTTGGGTCTTGGGCACTGCTTCCCGGATCCTGAGCGGGCCCGGCTGGGGTGGGATTGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@75_45.49939832708923
GGGGAGTCAGGATGCCGAGATGAGCCACAAAACTTCGGGCCCATGGGATAGTCAAGCACTTCTCTTCCGGCCCATCCTGCGGTCTTGGGGAGACTTCCTGTGGGGCAGCTGGAGTCAACTGGCGGGTCTTGGGGTAGCTTCCAGGGATCCGCTGGAGCGGGGCCGCGGGTGGGAATGGGGTCCGAGGATGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@88_45.49939832708923
GGGGTCAGATGCCCTGTAATGACCACAAGAAACTTTGGGCCCTATGGGTGGCGCTACGCAACTTCTCTCTCAGGCCAGTCCTGCGGTCATGTGGGAGCGACCTCTGTGGGGCAGCTGGAGTCAAGGCTTGGGGTTGGGGTATGCCCCAGGGCTCCGGCGGGCCCGGCTGTGGGGATGGGTCCCGAGGGCGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@12_45.24521174425633
GGGGTCAGACTTGCCCTGTATGAGCCACAGAACCTGGGCCATGGTAGGGTCAGCAACTCTCCTCACCGGCTCAGCGCTGCGGTTGTTGGGGAACTTCCTTGTGGGCTGCAGCTGGAGTCAAGGCTTGGGTCTTTGGCGGTGCTCCAGGGATCTGAGCGGCCCGGTGGGGTGGGGATTGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@32_45.24521174425633
GGGCGGTCAGATGCCCTGTAATGAGCCAAGCAACTATGGCCCATGGGTGGGTCAAGCAACTTCTCTCTCCGCTCGTCCTGCGGTCTGGGGGAGACCTTCCTTGGGCCAGCTGGAGTCAAGGCCTTGGGGTCTTGGGTTTGCTCAGGGCACCTGGCGGGGCCGGCTGGGGTGGGGATGGGTGGTCACGGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@53_45.24521174425633
GGGTGTTCCAAGATGCCCTAATGCAGCCACCAGACTTGGGCTCTGGGTAGGGTGCAGCAACTCTCTTCCAGGCCGCCTGGGTCTGTTGGGGAGACCTTCACTGTGGGCGAGCGTGGGTCAAGGCTGGTGGTTGGGGTATGCTCTCCCAGGCATACCGGCGGCCGGCTGGGGTGAGATGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@99_45.24521174425633
GGGGTCGAGCCCTGTAATGAGCCACTGACTGTGGGCTCCATGGGTAGGTGTCAAGCAATTCTCTCCCAGCCAGTCCTTGGGTCTGTGGGGAGACCTCCTGTGGGCGCACTGGAGTCAGGCTTGGGGTCGTTGGGAGTATTCCCAGGGCCTCTGGCGAGGCCCGGCTGGGTGAGGGATGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@13_44.99102516142341
GGGTCGATGCCCGTAATGAGCACAGAAAACTTGGGGCCCATGGGTTAGGGTCAACAACTTGCTCTCTCCAGGCTCAGTCCTGCGGCTGCGGGAGACTTCCTGTGGGCGCATGGAGTCAAGGTTTGGTCTGGGGTAATGCTTCCCAGGCATCCTCGGGCCCGCGCTGGGGTGGGGGATGGGTCCGAGGCT
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@44_44.99102516142341
GGGTCAGTGGACCCTTGTAATGAGCCCAAAACTTGGCCAGGCGTAGGTAGAATTCTCCTCCAGGCTAGTCCTCGTCTGTGGGGAGACCTTCCTGTGGCGCGAGCTTGAGAAGGCTTGGGGCTGGGGGTAGTGCCCCCAGGGCATCCGAGCGATGGGCCTCGGCGGGGTTTGGGGATGGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@46_44.99102516142341
GGGCAGAGCACCCTGTAATGAGCCACAGAAATTCGGGCATGGGTAGGGTCAAGCAACTCTCTCTCCAGGCTCAAGTCCGCGGCCTGTGGGGAGACCTCACTGGGCGCGGGCAAGGCTTTGGGTCTGGGGTATGCTTCCCTAGGCCATCCTGAGCGGGCCGGTGGGGTGTGTGGATTGGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@71_44.99102516142341
GGTGGTCAGATGCCCTGTAGACACAGAATCTTGCCCATGGTAAGGGTCAAGCAATCTCTCTCCAGCCAGCCTGCGCTGTAGGGGAGACTCTTCCTGTGGGCGTCGCTGGAGTCAAGGCTTGGGTCTTGGGGTAGACCCCAGGGCATCCCTAGCAGGGCCCGGCTGGGTAGGGCACTGGGGTCCGGAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@45_44.73683857859053
GGGGTCAGAGCCCTGTATATGAGCCACGAAAGCTCCCTGGTAGGCAAGCAACTCTCTCTCAGGCTCGAGTCCTGTCGGTCCTTGGGGAGACCTTCCGTGGGCGCGCTGGAGTCAGGCCTTGGGTCTGGGGTATGCTTCCACGGGCATCCTGAGCGGCCCGGCTGGGGTAGGGAGTTGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@49_44.73683857859053
GGTCAGATGCCCTGTATGAGCGCACAGAAAACTGGGCCCATGGGAGGGTTCCAACACGTTCTCTCTCCAGGCTCAGTCGCAGGTTGTGAGGGAGACCTTCCTGTGGGCGCGCATGGAGGTAAGGCTGGGTCTGGTATGCTTCCCAGGCACTGAGCGGGCCCGGGCGGGTGGGGATGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@10_44.48265199575763
GGGTCAGTGCCTTAAGTGACCAAGAAACTTTGGGCCTCGGGTAGGGTCAACAACCTTCCTCTCGCAGGGCTCAAGTGCTCGGTCTGTGGAGACCTTCCTTGGGCGCAAGCTGGAGTCAAGGTCTTGGGTCTTGGGTATGCTTCCCAGGATCCTGAACGGCCAGGCTGGTGGGGATGGGGTCCGAGGG
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@14_44.48265199575763
GGGTAGATGCCTGTCAATGTAGCCAAAAACTGGCCCATGGGTAGGGTAAGCAACTTCTCTCTCCAGGCTCAGTCTGCGTCTGTGGGGAGACCTTCCTGTGGGCGCGCTGGGTCTACGGCTTGGGTCTTGAGGTATGTTCCCAAGGCATCCTGAGCTGGCCGGCTGGGGTGGGGATGGGGTCCGAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@35_44.48265199575763
GGGGTCAGATGCCCTGTATGGCCAAGAACTGGGCCCATGGGGTCAGGTCAAGCAACTTCTTCTCAGGCTCGTCCGCTGGCTGTGGGGAGGACCTTCCTGTGGCGCGCTGGAGTCAAGGCTGGGTCTTGGGAGTATGCTTCCCAGTGGCACCGAGCGGGCCCGGCTGGGGTGGGGAGGGTTGCCGAGG
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@79_44.48265199575763
GAGCTGCCCTGTAATCGATGCACAGAAGACTTGGGCCCATGGTAGGTCAAGCAACTTCTCTCTCCAGGCTCGTCTGCGGTCTTGGGGAAGACCTTCCGTGGGCGCAGCGATCAAGGCCTTGGGTCCTTGGGGTATGCTTCCCAGGGCATCACTACGGCCCGGCTGGGGTGGGGAATGGGTCCGAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@87_44.48265199575763
GGGGTCAGATGCCCTGTAATGAAGCACAGAAACTTGGGCCACATGGGTAGGGTCAACAACTTCTCTCTCCAGGGCTCAGTCCGCGGTCTGTGGGTCCTCCTGTGGGAGCTGGGTCAAGGCATGGGGTCTTGGGGTAGCTTCCGGGATTCCTGGTCGGGCCGGCTGGCGTGGGATGGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@92_44.48265199575763
GGGTCGGCCTGTAATGAGCACAGAAATCTTGCGGCCCATGGAGTAGCGTCAACATTCTCTCTCCGCTCAGTCCTGCGGCTTGTGGGGAGACTCTTCTGTGGGTCGCAATGGAGTCAAGGCTTGGGTCTTGGGATATGCTCCAGGCACCTTGAGCCGGGCCCGGCTGGGGTGGGGATGGGGTGCAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@37_44.22846541292473
GGGGTCAGATGCCCTGTAATAGCAGCAGAGAACTTGGCCCATGGGGCTGGTGCAAGCAACTTCTCTCCCCAGGCTCAGTCTGCGGTCTGGGGGAGACCTTCTGGGGACGCAGCGAGTCAAGGCTTGGGGTTCTGGTATTCTCCCAGGCATCCGACGGCCCGGCTGGGGTGGGGATGGGGATCCGAG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@59_44.22846541292473
GCGGGTCAGATCCTGTAATAGCCACAGAACTTGGGCCCTGGGAGGGTCAAGCAACTTCTTCTCCAAGCTCAGTCCTGCGGTCTGTGGGGAGACCTTCCTGTGGCAGCACTGGAGTCAGGCTGGGGATCTATGGGTATGCTTCCCAGGGGCATCCTGAGCGGCCCGCTGGGTGGGAGGCGTCCGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@78_44.22846541292473
GGGTCAGAATGCCCTGTATGGCCACGAAACTGGCGGCCCATGGGTAGGGTCAGCAACTCTCCTCCAGGTCAGTCCGCGTCTTGGGGAGACCTTCCATGGCGCAGCTGGACGTCAAAGGCTTGGGTCTGGGTAGCTCCCAGGGGCATCCTGAGCGGGGCCCGGCTGGGTGGGGATGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@84_44.22846541292473
GGGGCAGATGCCCTGTAATAGCACCAGAAACTTGGGCCAATGGGTAGGTCAAGCTAACTTCCTTCAGGCTCGTCTGCGTCTGTGGGAGGACCTCTCGCTGGGGCACTGGGAGTCAGGCTTGGGTCTTGGGGTAGCTTCCCAGGGCCATCCTGAGCGCCGGCGGGGTGGGGATGGGTGCCAGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@27_43.97427883009181
GGGTCAGATGCCCTGCTGAATGACCCAGAAACTGGCATGGTAGGGTCAAGCAACTTCTCTCTCCAGGCTCAGTCCGCGGTTGTGGGAGACACCTTCTGTGGGGCCAGCTGGAGCAAGCTTGGGGTCTTGGGGTATGCTTCCAGGGCATCCGAGCGGCCCGGCTGGGGTGGGTGGGGTCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@80_43.97427883009181
GGGGTCGACCCTGTTAATGAGCCACAGAAACATTGGGCCCATGGTAGGGTCAAGCAACTTCTCTCCAGGCTCAGTCCCTGCGGTCTGTGGTGGAGACCTTCTGTGGGCGCTAGTGGAGTCAAGCTTGGGGTCTTGGGTATGCTTCCCAGTGGCATCTGAGGGCCGCTGGGGTGGGGAGGGGTCGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@4_43.720092247258926
GGGGTCAGACCTGAATGAGCCCATGCAAACTTGGCCCATGGGTAGGGTCAAACACTTCTCTTCTCCAGGCTAGTCCTGCGGTCTGGGGGAACTCCTGTGGGCGCAGTGGAGTCAAGGCTGGGGTCTTGGGGTATCCTCCCAGGGCACCTGGCGGCCCGGCTGGGGTGGGGATGGGGTCCGAGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@8_43.720092247258926
GGGGTCAGAGTGCCTGTAAGAGCCACAGAACTTGGTGCCCCATGGGTAGGGTCATAGCACTTCTTCTCAGCTAGTCTGCTGTCTGTGGGAGACCTTCCTGTGGCGCAGCTGGAGTCAAGGCTTGGGGTTTGGGGTAGCTCCCAGGGCATCCTAGCGGCCCGTCTGGGGTGGATGGGGTCCGAGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@77_43.720092247258926
GGTCAGATCCCTGTACATGAGCCACAGAAACTTGGCTGCATGGGTAGGGTCAAGGCAACTTCTCTCCAGGCTAGTCCTGCGGTCGTGGAGATCCTCCTGGGGCCAGCTGGGTAGGACTTGGGGGTTTGGGGTATGCTTCCCAGGCATCCTGAGCGGGCCCGGCTGGGGGGATGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@97_43.720092247258926
GGGGTCAGAGCCCTGTAATGAGCCACAAAACTGGGCACCATGGTAGGGCTCAAGCAATCTTCTCCAGCTTGTCTGCGGTCGTGGGGAGACCTTCTCGTGGCGCAGCGGAGCAAGGCTTGGGTCTTGGGGTATGCTTCCAGGGATGCCTGAGCGGGCCCGCTGGGGCTTGGGAGGGGTCCGAGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@0_43.465905664426025
GGGTCAGATGCCTGTAATGAGCAACACAACTGGGCCCATGGTAGGGTCAACAATTCTCTCCAGGCCAGTCCTGGCGGTCTCGTGGGAGACCTTCTGGGGTGCAGCTGGGAGCAAGCTTGGGTCTTGGGGTATGCATTCCAGGGCATCCTCGGCGGGCCGGCTGGGTGGGGATGGGTCAGAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@16_43.465905664426025
GGTCGATGCCCTGTATGAGCCACAGTACTGGGCATGGGTAGGGCAAGCAACTTCTTCTCCGGTCAGCTTGCGTCTTGGGGAGACCTTCCTGTGGCGCAGCTGGAGTCAAGGCTATGGGGTCTTGGGGTATGTCCCAGGGCATCCTGAGCGGCCCGGCTGCGGGTGGGGAGGGGTCCCGAGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@20_43.465905664426025
GGGGTCGAGATGCCCTGTAATGAGACAGAAAACTGGAGCCCGGGTAAGGGTCAAGATTCTCTCTCCAGCTCGCTGCGCTGTGGGGAGACCTTTCCGTGGCGCAGCTGGAGTCAAGGCTTGGAGGTTGGGGATGCTTCCCAGGGCATCCGACGGGCCCGGTGTGGGTGGGGATGGGTCACGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@23_43.465905664426025
GGGTCAGATGCCCTGTGATGGCACACGAAACCTTGCCATGTAGGTCAAGCACTTCTCTTCCAAGGCTCCGTCCTGCGGTCTTTGGGGAACTTCCTTGGGGCGAGCTGGAGTCAAGCTGGGTCTTGGGTATGCTTCCCAGGGCACCTGGCGGCTCCGGCTGGGAGTGGGGATGGGGTCCGGGGG
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@24_43.465905664426025
GGGGCGATGCCCTGTGAATGAGCCACGAAACTTGGGCCATGGGAGGGTCAAGCTTCCTCTCCAGGTCTCAGTCCTGCGGTTGGGGGGACCTTCTTAGGGCGCGCGGAGTCAAGGCTTGGGCAGTCTGGGGTTATGCTCCCAGGGATCTGAGGGGCCGGCTGGGGTGGGATGGGGTCCGAGCGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@51_43.465905664426025
GGTACAGATGTCCCTGTAATGAGCCACTAGAAACTTGGGCCCTGGTAGGGTAAGCACTTCTCTCCCAGGCTAGTGCTGCGGTCTGTGGGGAGATTCCAGTGCCGAGCTGGATCAAGGCTGGGTCTCTGGGGATGCCTTCCCAGGGCATCCTGAGCGGGCCGCTGGGTGGGATGGGGTCGGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@66_43.465905664426025
GGGGTTCAGAGCCTGAATGAGCCACAGAAACTGGCCATGGTAGGTCAGCAACTTCTCTCTCCACTCAGTCTGCGGTCTTGGGAGGACCTTCCCTTGGGCGCAGCTGGAGTCAGTTGGGGTTGGGGTATGCTTCCCAGGGCTATCCTGAGCGGGCCCGGCTGGGGGGGGACTTGGGGTCCGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@85_43.465905664426025
GCGGCAGATGCCTTGTATGAGCCCAAAACTTCGGGCCATGGGTAAGGGTCAAGAACTTTCTCTCAGGCTCAGCCTGCGTCTTGGGGACCTACTGTGGGCGCAGCTGGAGTCAAGGCTTGGGTCTTGGGTATGCTCTTCCCAGGCATCTGAGCGTGGCCCGGCTGGGTGGGATGGGTCCGAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@98_43.465905664426025
GGAGGTAGATGCCCTGAATGAGCCAAGATAACTTGGGCCCATGGAGTACAAGCAACTCTTTCCAGCCAGTCCGCGGTCTGTGGGGTAGACCTTCTGTGGGCGCAGCGGAGAGGCTTGGGGTCTCTGGGGTAGTGCTTCCCGGGCATCTGGCGGCCGGCGTGGGGATGGGATGGGATCCGAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@65_43.21171908159311
GGGGTCATGCCCTGTAATGAGCCACAACTTGGCCCCATGGGTAGGGTCAAGCACTTCTCCTCCAGGCCGTCGCGGTCGTGGGTAGACCTTCCGTGGGCGCAGCTGGAGTCAAGGCTTTGGGTTGGGTGATAGCTTCCAGGGAGTCCTGAGCGGCCCGCTGGGTGGAGGATGGGGTCCGAGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@89_43.21171908159311
GGTCAGACTCCCGTTGAGCCACAGAAACTTGGGCCCTGGTAGGGTCAAGCAATTTCTCTCAGGCTCAGTCCGCGGCTGGGGACCCTTCCTGGGGCGCAGCTGAGTCAAGGCTTGGGGTCTTGGGGTATGCTTCCAGGCATCCTGAGCGGGCCCGGGCATGGATGGGAGTGGTGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@72_42.95753249876021
GGGGTCAGATGCCCGTAATGAGCCACAAAACTTGGGCCATGGTAGGGTAAGCATTCTCCCAGGCTCAGTCTGCGGTCTGTGGGGACCTTCCGTGGGGCGCAGTGGAGAAGGACTTGGGGTCTTCGGGTATGCTTCCAGGCATCCTGGATGCGGTGCCCGGCTGGGGGCATGGGGTCCAGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@7_42.70334591592731
GGGAGTAGAGCCCTGTATGAGCCACAGAACTGGGCCACATGGGGTAGGGTCAGCAACTTCTCCAGGCTCAAGTCTGCGGTTGGTGGGGGCTTCTGTCGCAGCTGGATCAGGTCTTAGTGGGTCTTGGGGTTGCTTCCAGGGATCCTGAGCGGGCCGCTGGGGTGGGTGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@28_42.70334591592731
GGGTCATGCCCTGTAATGAGCCAGAAACTTGGCCCAGTAGGGCAGCAACTTCTCTACTCGCAGCTTCGTCGCTGCGGTGTGGGGAGACCTTCCTGTGGGGCGAGCTGACGTCAAAGGCTTGGGGCTTGGGTTGCTTCCAGGCACCTGAGCGGCCCGGCTGGGGTGGGAGGGGTCCGAGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@64_42.70334591592731
GGGGTCAGATGCCGTATGAGCCCAGAAACTTGGCCCAGTGGGTAAGCACACTTCTCTCTCCAGGCTCAGTCCGCGGGTTGTGGGGAACCTCCTGTGGCGCACTGAGTGCGAAGCTTGGGGTTTTGGGTATGATTCCACGGGCATCCTGAGCGGGCCCGGCGGGGGGGATGGGGTCAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@95_42.70334591592731
GGGTCAGGCCCTGATAATGAGCCAACGACTTGGGCCCTGGTAGTCAAGCAACTTCCTCCCAGGCTCAGTCCTGCGGCTGTGGAGACCTCCTGTGGGCGCGCTGGAGCAAGGCTGGGGTCTCTGGGGTTAGCTTCCCAGGGCATCCGAGCGGGCCCGGCGGGGGGGATGGGGTCCGAGGGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@96_42.70334591592731
GGTCAGATGCCCTTATGAGCCACAGACTTGGGCCCATGGGTAGGGTCAAGCAACTCTCTCTCCAAGGCTCAGTCCTGGGTCTGTGGGGAGACCTGGGCGAGGCTGAGTCAAGCTTGGGGTTGGGGATGCTTCCCAGGCAAATCCTGATGCGGGCCCGGCTGGGTGTGGGATGGGGTCCGC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@67_42.44915933309441
GGGGCAGAGCCCTGAATGAGCCACAGCAACTTTGGGCCCATGGTAGGTCAAGAACTTCCTCTCAGGCCTCAGTCTGCGGTTGTGGAGACCTTCCTGGGCGCAGCTGGAGTAGGCTTGGGGTCTTGGGGTATGCATCCCAGGGCACTGAGGGGCGGCTGGGGTGGGATGGGGGTCCGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@86_42.44915933309441
GGGGTCAGATGCCTGTATAGCCACAGAACTGGGCCATGTAGCAAGCAACTTCTCCTCTAGGCTCAGTCTGCGGTCTGTGGGGAGCCTTCCTGTGGGGCCTGGAGTCAAGGCTTGGGGTCCTTGGGGTATGTTCCCAGGCATCCTGAGGGGCCCGTGGGTGTTGGGGGATGGGTCCGGGC
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@55_42.194972750261506
GGGGTCAAGCTGTAATGGCACGAGAATTGGCCCACTTGGGAGGTAAGAAACTTTTCTCTCGGCTCAGTCTGCGGTCTGTGGAGACCTTCCTTGTGGCGGCAGTCTGAGTCAAGGCGGGGTCTTGGGGGCCCCAGGGGCACCTGAGCGGCCCGGCTGGGTGGGCGATGGGGTCCGAGGG
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rvaser commented 5 years ago

@ksahlin, could you please try convex gaps implemented on branch temp? I am trying to find some parameter combination which should yield better alignments but am failing at it.

ksahlin commented 5 years ago

Will do! Thanks!

rvaser commented 5 years ago

@ksahlin, any progress? :P

ksahlin commented 5 years ago

Hey @rvaser, some unexpected things came up, so I've been very busy lately and will continue to be over the next two weeks. I'll let you know if I find some spare time to work on this in the upcoming week; otherwise I'll pick this project up again in two weeks.

rvaser commented 5 years ago

Okay, no problem :)

ksahlin commented 5 years ago

Hi, just got to this evaluation. I performed an indirect evaluation as follows.

I generated 100 instances of similar nature to the data above, i.e., 100 reads each, with a fraction of them having a 50bp deletion. That is, reads are either ~150 or ~100bp long (depending on whether they have the deletion or not). The error rate of reads is set to 16%. For each p=0.1,0.2,...,1.0, 10 of the 100 instances have a fraction p of reads containing the 50bp deletion.
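For anyone who wants to set up a similar test, here is a minimal sketch of how one such instance could be generated. It is not the script used for the evaluation: the function names, the uniform split of errors into substitutions, insertions and deletions, and the deletion position are all assumptions made for illustration.

```python
import random


def mutate(seq, error_rate, rng):
    """Return a noisy copy of seq; each base is hit by an error with probability error_rate.

    Assumption: errors are split evenly between substitutions, insertions and deletions.
    """
    out = []
    for base in seq:
        if rng.random() < error_rate:
            kind = rng.choice(["sub", "ins", "del"])
            if kind == "sub":
                out.append(rng.choice([b for b in "ACGT" if b != base]))
            elif kind == "ins":
                out.append(base)
                out.append(rng.choice("ACGT"))
            # "del": the base is simply dropped
        else:
            out.append(base)
    return "".join(out)


def simulate_instance(reference, del_start, del_len, p,
                      n_reads=100, error_rate=0.16, seed=0):
    """Simulate n_reads reads; a fraction p of them lack reference[del_start:del_start + del_len]."""
    rng = random.Random(seed)
    n_deleted = round(p * n_reads)
    reads = []
    for i in range(n_reads):
        template = reference
        if i < n_deleted:
            template = reference[:del_start] + reference[del_start + del_len:]
        reads.append(mutate(template, error_rate, rng))
    return reads


# Hypothetical 150bp reference with a 50bp deletion in 30% of the reads.
ref_rng = random.Random(1)
reference = "".join(ref_rng.choice("ACGT") for _ in range(150))
reads = simulate_instance(reference, del_start=50, del_len=50, p=0.3)
```

Looping over p=0.1,...,1.0 with 10 different seeds each would give 100 instances of this kind.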

I then reconstructed the reference with a customized longest path, sort of... (i.e., not the heaviest path default) and looked at the reconstructed sequence for spoa_affine vs spoa_convex. The results turned out slightly, but significantly (in the statistical sense), better for me with convex!

Parameters for spoa_affine (latest master): -l 2 -r 2 -x -4 -m 10 -g -8 -e -1

Parameters for spoa_convex: -l 2 -r 2 -m 10 -g -8 -e -2 -q -24 -c -1
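To make the difference between the two parameter sets concrete, here is a small sketch of a two-piece affine ("convex") gap score as discussed earlier in the thread. It assumes that -g/-e and -q/-c are the open/extend penalties of the two affine functions, that the open penalty is charged for the first gap base, and that the better of the two scores is taken. The function names are made up for this example, and the exact convention inside spoa may differ.

```python
def affine_gap_score(length, gap_open, gap_extend):
    """Score of a gap of `length` bases under a single affine function.

    Assumption: gap_open is charged for the first gap base and gap_extend
    for every additional base; both are non-positive, like the flags above.
    """
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return gap_open + (length - 1) * gap_extend


def convex_gap_score(length, g, e, q, c):
    """Two-piece affine gap score: the better (higher) of two affine functions."""
    return max(affine_gap_score(length, g, e), affine_gap_score(length, q, c))


for k in (1, 17, 50):
    print(k, convex_gap_score(k, g=-8, e=-2, q=-24, c=-1))
# prints "1 -8", "17 -40" and "50 -73"
```

Under these assumptions the first function (-g -8 -e -2) is the better one for short gaps and the second (-q -24 -c -1) takes over past 17 bases, so long gaps such as the 50bp deletion get cheaper per base than short ones.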

So this is an improvement in my analysis and a welcome update to spoa, thank you!

I cannot check the implementation on a detailed level (like finding bugs/corner cases), but the end results turn out better, so that's a good sign.

Also: I get "invalid option -- x" for spoa_convex, which is why I put -x -4 only in the affine version, since I assume that's the default. (As for choosing suitable parameters, I mimicked the default ones used in minimap2.)

rvaser commented 5 years ago

Well, that is great! I'll add the SIMD version in the next few days then, and will need you to test it again please :)

Option -x has been replaced with -n.

rvaser commented 5 years ago

@ksahlin I finally managed to implement the SIMD version (on branch temp). Could you please try it out? I'll merge it to master afterwards :)

rvaser commented 5 years ago

Convex gaps are now on master. I'll close this issue, feel free to reopen if something is not right :)