isovic / graphmap

GraphMap - A highly sensitive and accurate mapper for long, error-prone reads http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/graphmap2
MIT License

segfault on some reads #43

Open robegan21 opened 7 years ago

robegan21 commented 7 years ago

Hello, I'm getting a segfault with a build of the latest version: 817ba03fc19e0fb2076381eb9536ff99240bd468

Using this graphmap command line: graphmap align --threads 1 --ref Chlamy_and_lambda.fa --index Chlamy_and_lambda.fa.graphmap.gmidx --reads y.fastq -o y.sam

using the following reference: http://portal.nersc.gov/dna/RD/Adv-Seq/ONT/Chlamy_and_lambda.fa

and one of the reads that triggers it is: http://portal.nersc.gov/dna/RD/Adv-Seq/ONT/y.fastq

gdb of the core dump narrows it down to codebase/seqlib/src/libs/edlib.cpp:1150

I suspect there is an integer overflow somewhere, since blockIdx is suspiciously large, but I don't understand the code very well:

#0  0x000000000040ddef in obtainAlignmentHirschberg (
    query=0x1d962f5 "CGCCTTCACACACACACACACCACCGCGAACCCACCCACCACCCGCACCACCCACCGCACCACCCACCCGCCGCCCACCCACCCACCCACCACCACCCACACCGCACCCACCACACCCACCCACCACACCCACCACCACCACCCACCCACCCACCACCCACCCACCCACCACACACACGCACACCACCCACCACCACCCA"..., 
    rQuery=0x1db0560 "GAGAGAACACGTTTGCGCGAGCACTTTTACGTCCGCACCCGCCACGCGAACCGTCGAGTTCCCCTTCCTCCCGACCACCGCCGCCGCCGTTCCCGCTTGACTGCCGCGTCCTCAGCGCCTTCGACGCCGTCCAAACACACCCGCCTCCTTACCTCCTCTCCCGCCGGCCACACAACACGCCTTTCGCTCCCGGCACTCCC"..., queryLength=2240, target=0x7fd4ddab20ae 'N' <repeats 200 times>..., 
    rTarget=0x1dba150 "TTGTGCGTGCGCGTCACCCTACTCGCGGACCTGCGCCGCGACCTACGGAGCGACCGCGGCGAAACGACCCCCGCGCTGGACCGAGTGCACATGCCGTCGGGTGCGCCGAAGTACCGCCGTGTGGGGCTCGACAAGCGCAAGGCTGTCCACTCCCCCCCCCCGACCCCACACACACCCACACCCACTCCACACACCCCCAC"..., targetLength=14666, alphabetLength=128, bestScore=12426, alignment=0x7fff02f76180, alignmentLength=0x7fff02f7617c) at codebase/seqlib/src/libs/edlib.cpp:1150
1150                        alignDataLeftHalf->scores[blockIdx]);
(gdb) list
1145        // and ending with scoresLeftEndIdx row (0-indexed).
1146        int scoresLeftLength = (lastBlockIdxLeft - firstBlockIdxLeft + 1) * WORD_SIZE;
1147        int* scoresLeft = new int[scoresLeftLength];
1148        for (int blockIdx = firstBlockIdxLeft; blockIdx <= lastBlockIdxLeft; blockIdx++) {
1149            Block block(alignDataLeftHalf->Ps[blockIdx], alignDataLeftHalf->Ms[blockIdx],
1150                        alignDataLeftHalf->scores[blockIdx]);
1151            readBlock(block, scoresLeft + (blockIdx - firstBlockIdxLeft) * WORD_SIZE);
1152        }
1153        int scoresLeftStartIdx = firstBlockIdxLeft * WORD_SIZE;
1154        // If last block contains padding, shorten the length of scores for the length of padding.
(gdb) p blockIdx
$1 = 1483722392
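
A value like this could come from an overflow, but it could equally be an uninitialized loop bound. One way to tell the two apart before the crash is to validate the block range ahead of the loop at line 1148. The following is only a minimal sketch: `blockRangeIsValid` and `numBlocks` are hypothetical names that do not exist in edlib, and it assumes the `Ps`, `Ms` and `scores` arrays each hold `numBlocks` entries.

```cpp
#include <cassert>

// Hypothetical helper (not part of edlib): returns true only when
// [firstBlockIdx, lastBlockIdx] is a non-empty range inside [0, numBlocks).
// Assumption: the per-block arrays each hold numBlocks entries.
bool blockRangeIsValid(int firstBlockIdx, int lastBlockIdx, int numBlocks) {
    return numBlocks > 0 &&
           firstBlockIdx >= 0 &&
           firstBlockIdx <= lastBlockIdx &&
           lastBlockIdx < numBlocks;
}
```

With a check of this kind in front of the loop, a bound such as 1483722392 would be rejected up front instead of being used to index the arrays.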

Here is the output:

[16:49:26 Index] Running in normal (parsimonious) mode. Only one index will be used.
[16:49:26 Index] Index already exists. Loading from file.
[16:49:27 Index] Index loaded in 1.25 sec.
[16:49:27 Index] Memory consumption: [currentRSS = 2103 MB, peakRSS = 2104 MB]

[16:49:27 Run] Automatically setting the maximum allowed number of regions: max. 2346, attempt to reduce after 0
[16:49:27 Run] No limit to the maximum number of seed hits will be set in region selection.
[16:49:27 Run] Reference genome is assumed to be linear.
[16:49:27 Run] Only one alignment will be reported per mapped read.
[16:49:27 ProcessReads] Reads will be loaded in batches of up to 1024 MB in size.
[16:49:27 ProcessReads] Batch of 1 reads (0 MiB) loaded in 0.00 sec. (38832296 bases)
[16:49:27 ProcessReads] Memory consumption: [currentRSS = 2103 MB, peakRSS = 2104 MB]
[16:49:27 ProcessReads] Using 1 threads.
[16:49:27 ProcessReads] [CPU time: 0.00 sec, RSS: 2103 MB] Read: 0/1 (0.00%) [m: 0, u: 0], length = 25116, qname: 72a2fdd0-ab17-47cb-982b-c5d663753a66_Basecall_...Segmentation fault

isovic commented 7 years ago

Hi Rob! Hm this is interesting, there seems to be a problem with the alignment library. Let me look into this and report back soon. Thank you for such a detailed report and the data, it will be very useful!

Best regards, Ivan.

gringer commented 7 years ago

I've had this same problem. I've attached an example fastq file (2.5kb read) + reference set (12 barcode sequences) that has this problem, as well as another reference sequence set (the first 4 sequences from the segfaulting set) that doesn't segfault.

graphmap_segfault.zip

gringer commented 7 years ago

And here is another sequence (652bp) that segfaults for me on the same smaller reference sequence set of 4 barcodes, but not with the larger set of 12 barcodes:

graphmap_segfault2.zip

gringer commented 7 years ago

The segfault for me is in the region_selection code:

0x0000555555676921 in GraphMap::RegionSelectionNoCopy_ (this=0x7fffffffd700, bin_size=217, mapping_data=0x7fffe6eccb80, indexes=..., read=0x555555a50e80, parameters=0x7fffffffd720) at src/graphmap/region_selection.cc:142
142             if (last_update_chromosome[reference_index][position_bin] == (i + 1)) {
(gdb) bt
#0  0x0000555555676921 in GraphMap::RegionSelectionNoCopy_ (this=0x7fffffffd700, bin_size=217, mapping_data=0x7fffe6eccb80, indexes=..., read=0x555555a50e80, parameters=0x7fffffffd720)
    at src/graphmap/region_selection.cc:142
#1  0x0000555555686983 in GraphMap::ProcessRead (this=0x7fffffffd700, mapping_data=0x7fffe6eccb80, indexes=..., read=0x555555a50e80, parameters=0x7fffffffd720, evalue_params=0x555555a51040)
    at src/graphmap/process_read.cc:44
#2  0x00005555556a76a5 in GraphMap::ProcessSequenceFileInParallel () at src/graphmap/graphmap.cc:374
#3  0x00007ffff79a66b6 in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
#4  0x00007ffff7bc4464 in start_thread (arg=0x7fffe6ecd700) at pthread_create.c:333
#5  0x00007ffff71bc9df in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105

This looks suspicious:

(gdb) print hit_vector.size()
$30 = 9
(gdb) print hit_vector[1]
$31 = (__gnu_cxx::__alloc_traits<std::allocator<long*> >::value_type &) @0x7fffe0000988: 0x0
(gdb) print hit_vector[2]
$32 = (__gnu_cxx::__alloc_traits<std::allocator<long*> >::value_type &) @0x7fffe0000990: 0x0
(gdb) print hit_vector[3]
$33 = (__gnu_cxx::__alloc_traits<std::allocator<long*> >::value_type &) @0x7fffe0000998: 0x0
(gdb) print hit_vector[4]
$34 = (__gnu_cxx::__alloc_traits<std::allocator<long*> >::value_type &) @0x7fffe00009a0: 0x555555a4de90

This code makes me uncomfortable (a vector of pointers):

        clock_t diff_find_seeds = clock();
        std::vector<int64_t *> hit_vector;
        std::vector<uint64_t> hit_counts;
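
The null entries may simply mean that no hits were found for those seeds, so they are not necessarily a bug in themselves. They only become a problem if later code dereferences them without checking. A minimal sketch of such a check, assuming nothing about GraphMap's actual lookup code (`countNonNullHits` is a hypothetical helper):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only: counts the entries of a
// pointer vector that are safe to dereference (i.e. not null).
std::size_t countNonNullHits(const std::vector<int64_t*>& hits) {
    std::size_t valid = 0;
    for (const int64_t* hit : hits) {
        if (hit != nullptr) {
            ++valid;
        }
    }
    return valid;
}
```

Comparing this count against `hit_vector.size()` in the debugger would show how many of the 9 entries actually point at hit data.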

gringer commented 7 years ago

Curious... the lengths of all sequences (4 reverse/forward, so 8 total) in the reference_lengths array are correct, but y_local is not:

(gdb) print y_local
$24 = 17499632
(gdb) print l_local
$25 = 17499588
(gdb) print index->get_reference_lengths()[0]
$26 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f0f0: 38
(gdb) print index->get_reference_lengths()[1]
$27 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f0f8: 38
(gdb) print index->get_reference_lengths()[2]
$28 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f100: 38
(gdb) print index->get_reference_lengths()[3]
$29 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f108: 38
(gdb) print index->get_reference_lengths()[4]
$30 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f110: 38
(gdb) print index->get_reference_lengths()[5]
$31 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f118: 38
(gdb) print index->get_reference_lengths()[6]
$32 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f120: 38
(gdb) print index->get_reference_lengths()[7]
$33 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f128: 38
(gdb) print index->get_reference_lengths()[8]
$34 = (__gnu_cxx::__alloc_traits<std::allocator<unsigned long> >::value_type &) @0x555555a4f130: 0
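
If y_local is meant to be a coordinate within a single reference (an assumption on my part, I have not confirmed this in the code), then a value of 17499632 against reference lengths of 38 would produce a bin index far outside the per-reference arrays. A minimal sketch of a guard, with `positionToBin` as a hypothetical helper that is not part of GraphMap:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps a position on a reference to its bin index,
// or returns -1 when the position lies outside the reference.
// Assumption: positions are 0-based and must be < referenceLength.
int64_t positionToBin(int64_t position, int64_t referenceLength, int64_t binSize) {
    if (binSize <= 0 || position < 0 || position >= referenceLength) {
        return -1;
    }
    return position / binSize;
}
```

With the values from the session above (bin_size=217, length 38), any valid position falls in bin 0, while y_local would be rejected.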

gringer commented 7 years ago

Compiling with clang++ (and with g++ for the files that need OpenMP, because my version of clang can't find omp.h), I got the following warning:

src/index/index_spaced_hash_fast.cc:844:22: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
        if (hash_key < 0 || hash_key >= num_kmers_) {
            ~~~~~~~~ ^ ~

Changing hash_key here to int64_t (was uint64_t), consistent with the other code areas where hash_key is used, made this warning go away (but the segfault still occurs).
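
That the segfault persists is consistent with how unsigned comparisons behave: a negative value stored in a uint64_t wraps to a very large number, so the upper-bound half of the check still rejects it. A small sketch of both variants (the function names are made up for illustration and do not exist in GraphMap):

```cpp
#include <cassert>
#include <cstdint>

// Unsigned key: "key < 0" can never be true, so only the upper bound matters.
// A wrapped "negative" key is huge and fails the upper-bound test anyway.
bool keyInRangeUnsigned(uint64_t key, uint64_t numKmers) {
    return key < numKmers;
}

// Signed key: both bounds have to be checked explicitly.
bool keyInRangeSigned(int64_t key, int64_t numKmers) {
    return key >= 0 && key < numKmers;
}
```

Either form rejects an out-of-range key, so the warning points at redundant code and not at the cause of the crash.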

Clang reported a couple of other warnings associated with printing text:

src/program_parameters.cc:110:109: warning: expression result unused [-Wunused-value]
    ss << SOFTWARE_NAME << " - A very accurate and sensitive long-read, high error-rate sequence mapper\n", SOFTWARE_NAME;
                                                                                                            ^~~~~~~~~~~~~
src/program_parameters.cc:257:109: warning: expression result unused [-Wunused-value]
    ss << SOFTWARE_NAME << " - A very accurate and sensitive long-read, high error-rate sequence mapper\n", SOFTWARE_NAME;
                                                                                                            ^~~~~~~~~~~~~

isovic commented 7 years ago

Hi David, thank you for the detailed report! I'll take a thorough look soon - I'm currently working on the initial comment from this issue. Unfortunately, my time is a bit thin at the moment, so it might take a few days until I finally address these.

Best regards, Ivan.

ocxtal commented 7 years ago

Hi Ivan,

I had a similar problem in my environment. GraphMap (v0.3.1) segfaulted when aligning reads to the D. melanogaster (dm6) genome.

Here are my findings (line numbers are based on the current master: e556a5f):

So I wrote a dirty patch that keeps blockIdx valid when the myersCalcEditDistanceNW function returns at edlib.cpp:756, by adding initializations of the *alignData elements above the return statement. It successfully avoided the segfault in my environment. I cannot tell whether this patch is the correct way to handle sequences with multiple Ns, but I'd be glad if it helps you investigate the problem further.
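
The general pattern behind the patch can be sketched as follows. This is not the code from the linked commit: `AlignDataSketch` and `computeSketch` are hypothetical, heavily simplified stand-ins, and the 64-bit block size is an assumption.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the alignment data structure.
struct AlignDataSketch {
    int firstBlock;
    int lastBlock;
};

// Sets the out-parameter to a defined "empty" state before any early
// return, so the caller never reads uninitialized block indices.
// Returns false when no alignment was computed.
bool computeSketch(int queryLength, AlignDataSketch* out) {
    out->firstBlock = 0;
    out->lastBlock = -1;  // empty range: a loop over first..last runs zero times
    if (queryLength <= 0) {
        return false;  // early return, but *out is already well-defined
    }
    out->lastBlock = (queryLength - 1) / 64;  // assumes 64 cells per block
    return true;
}
```

A caller that loops from `firstBlock` to `lastBlock` then does nothing after an early return, instead of walking off into garbage indices.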

https://github.com/ocxtal/graphmap/tree/edlib_segv_patch https://github.com/ocxtal/seqlib/tree/edlib_segv_patch https://github.com/ocxtal/seqlib/commit/050873d3f6bd8c9a6870da0eba6416c9e85d8991

Some debug details below:

The reference sequence was obtained from the UCSC Genome Browser download site and passed to the software without modification. http://hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/

The query sequence:

@S4_13395 id=0.681210 len=18305
CACCACGTTTACGGGACCAGTTCGGGTGACCGAGTACGAACCGAGTACCGGACCTAGACGGGACACGAGTACGAAGACGGACCAGAACGGGGGGACCAAGTACGGGACCAGGTTAGGGAACACAGTATGAGGGACCCAGGTACCGGGACCCAAGTAGGGGATCAAGTGACGGCCACGAGTACGGGACCAGGTACGGGAGCCAGTACGGAAGTGCGAGTACGGAACCAGTCGGGGGACCCCAGTACGGGAATCAGAGTATGGGACGAGGTACCAGTGGTACCCCCAATACGGGACCCCAGGTACGGGGGCCAGTACGGGGAACCATACGGGACCAGGTACGGGACCAGTACGGGACCAGTCGTGTCCCCCGGCACAGGCACCCCAGCTAAAGGTACCAGGTACCGGGACCAGGTAGCGGGACCCAGTGTCGGGAACCAGTACGCAGGACCAAGACGGGTCACAAAGTTCTCGGGCCAGTCGGGAGACTATGTAAGGGGACCAACCAGGATGGGAGGCGAGTACCGGACGAGTTACCGGGAGACCATTACGGAACCATGTTACGTGGACCAGTTCACGGAAACCGTAGTACGGACGATACGGAACAAGAACGAGAAACCATAGGAAGCGGTGAGTTACCGGAACCCGAAAGGTTACTCGGAACCCAGTTACGGATACGTAGTAACGGGACCGAGTACGGGTAAGGGAGCCAGTACGGGCACCAGTAGCGGGGACCAGGAAAGCCCGGGGAACAGTTAGGGGACCATACGGTGACCAGCCCGCGGCCAGCGTTAGGACCAAGTACGGGGGAACCCAGATTCAGACCAGTCAGACCAGTACGGGTACCGGAGTACGGAAAAAGTAGTACCGGAACACCGAGTACGGAACGAGTCCCGCGACAGATAGGGGACCAGTACGGGACCACATTAGGGACCAGTATCCGGGACCGAAGTCACGGGACCAGGTCGGGGAAGCCCAGAGGTCGGGACCAGTGCGGGTACCAGGTACGGGCCAAGCTACGCGGGAACCAGTTACGGGGACGCGGGTCGGGAACCAGTACGGGGACGAGTACGGTACCGCGAGATACCGGGAACCAGTTATGGCTACCGAGTTACGGAATACCGAAGTACGGAACCGAGTATGTAAACCGAGTAAACGGAACCAGTATACCCGGGGACCTGGTACACCGGAATTATAGTACGGGAACACACCCGAGTTAGGGTTACGGCACCGAAAGTACGGGGAACCAGGAAAGGGGGTACAGGCTACGGGACCCCAGTAACGGGACCCAGACGTGACCAGACCAGGGGCGCAAGACGGTACCAGTACGGGAGCCGAGGTACGGAACCAGTCATCGGACCCGATGTAACTGAACCGAGTCGGAGCCGAATGACGAAACCGACGTACCGGAGCCTAGACGGGAACTTACTTGCGACCCAACCTAACGGATCGAGAACGGGACCGAGTACGGGGACCCTACGGGACCAGTACGGGGGAGTACGGGCCAGTACCTAGAACTATTACGGGGAAGAACCAGTACGTGACCGTACGGTGACCAGCACAGGGCAGCACGGTACGAGTCAGGGCCAGTACGTGTCCCCACTAACGAGGGAGCAATACGGGACCGTAAGGGGACCGTACGGGGACCGGAGACCGGAATCTCAACGTCGTGAGACACAATTACGGAACAGTACGGGAACCGAGGCACGGACCGATGTACCGGAACCACCGCGACGGAACCGGACGGAACCCGAGTACCGTGAATCGAAGTACGGGACCGAGTTAACGAGCACGGGACCCAGGTACGGGACCCAGTCGCGAGGACCAGTACGGGACCAGTCGGCACCAGTACGGGAACCAGTACGGGGACCCAAAGTACGGACCCAGTCACGGGAGCAGTAACCCGGACGAAAGTTTACCCGTGAACCAGTTACCCGGGCGCGACGTAAGCGGGGACCAGTTATGGACCGAGAAGTACGCGGAAACCGAGTGACGTGACCCTT
AGCGGAACCAGACGGGCACCCGAGTAACGGCAACGAGGTACGGGAACCCCCGAGTTACTGAACGAGGCGAAGCCAGTAACGGGTGAACCCATTTCTATCTGAATGGGATACGAGAACCGGAGTACGAGTACGGGATCCGAGTACGGGACCTGTAACTGAGACCGTGTACGGGAACGCGTAGTATCTAACCAGTACGGAATCGGTGACTACGGAACCCAGGTAACGAGTACGGGACCAAGTCCGGGGGAACCAGACGTTGCGGAACCGGTACGGGAACCCCAAGTCTCGGAATCGAGTACGGGACCGATAACTGGAGTACGGAACCAATACGGGACCAGAACGCGGTACCAGTACCGGGCACCAGCCCAATCGAGGACTACAGGTACGGGTCCGAGGTACGGGGACCAGGTTACGGAACCGTGACCGGGACCACGTACGGGACCAAGTACCGGGGACCAGGCACGGTACCGAGGTACCCGCGACACTAGAACGGGCACCAGATACCGGGAACCGATGAACCGGAACCCGAGGTACCAGGAACCAGTAAGCACCGAAGTAACGGAACCCCTGGCTAACGGAATCCGATGTACTGAACCCAGTACGGAACCAGTAACGGGGAACCCCAGAGTACTGGGAATATCGAGTACAAGGACCGCGAATTACGAAGTTTACGGGACCGAGTTAACGGGACCAGTAACGGGGACCAGTAACGGGCAGGTCGGGACCCAAACGGACGATACGACAGTCGGGGTACCCAGTACGGTGACCCAGCACAGGGCGCCAGCCACGGTACCAGATAGCGGGAACATGACGGCAATAGTACGGGGACGGAGACGAACATAGCCGGGACCCGAGTACTAACCCGAGACGTGACCCGAGTACGGAAAACCCGAGTGCGGAACCAGATGACGGGCGAACCTCCACTACGGAATTCGATTACGGGACCGAAGGTACGGACCCAGGTAACGGAGCCAGGTATCGGACCAGTACGGGGCCTCAGTTACTGACAGTACGGGACACCAGTACCGTGACCACAGTACGTGATCCAGCCCAAGGCCCAGCAACCGGTTACCGAGTTACGGGAGCCAGTTACGGGACCAGTACGGACGAGTTATGGGGTAACCACCATGGGACCGAGTACGAACCAATGTACGTGACCCAATTTACGGAATCAGTAGCGGGACCGTGTAGGAAACCGAGTAACGGGAACCGAGTCGAACCCGAGATGGACCCAGTATCGGAATCCGAGTACGGGACCCAGTACCGATGTACAGGAACATTACGGGGGACCAGTAGGGAACAGTCGGGAACAGTTACGGGGACCAAGTACGGGTCCAGCTACGAGGATCCCCAGGTACGGACCAATACGGTGACCAGGTAGCGGGAACCACTAGGGACCCAGTTACGGACCTACTACCGGTAAATCCACGTACGGTACAGTAACCGGGGGACCAGTAAGGGGAGCCAGTTTAGGCACAGTACGGGTCTCAGTAACGGGTGCCAGTAGGGACCAGTACGGGAGCTAGTAAACGGTAACCGAGTACGGGGGTACCCATGTACGTCGACCAGTAGGGGGACCAGTTACCGGGCTACCAGTAACGCGACCAGTTCGCGGCCGCAGGTACGCGGACCAGTACAGGGACCGGTACGGAACCAGAGTAACGGGACCAGTACGGTTAACGAAGTACGGACCGAGTACGAACCCGAGTACGCAACAGGAGCTACGGGAACCCAAGGTAGGGGGAACCCAGTACGGATCAGTAACGGACCCGAGATACTGGACGAGGACGGGAACCAGTACGGGAACCCGGTTCGGGAGGCCGAAATACCGGGCACGAGAAGGGGTGGACCAGTACAGGGACCCCAGTACGGGACCTAGGTACGGTGACCGGAGTACAACCGGGGACGAGAGTAAGGACCGAGTACCGGCACGAGTACGACCCTAGAACCGGAACGCGAGAACGGCTAACCCCGAGTACCAGAATCCCGAGTACGGGGAC
CTGCAGTACGGAAAATCCGAATAGTTACGGGACCGGGTACGAGTCCCGGGTACCCGATCGGGGTACCAGTTTCGGGGACCCAGTACGGGAACCAGATTACGAGACCAAGTACGGTGACCAGCACAAGGGCCAGCAACGGTACAGTAACGGGACCGATGTCTTGAACCATAGCGGGGGACCCCCGAGGTACGGACACCGAGGTACGGAACCTCGAGGTACGGAACCGGTTACGTTGAACCGAGGTCGGAACCGTTTACGGGGACCGAGCCCTACGGATTTACCGAATTACGGGGACTCGAAGACGGGACCCAGGTTACCTCGGGAAACAGTACGGACCAGTACGGGATCAGTATCCCCGGGGACTCAGAGATTACGGACAGTACGTTGTAACCAGTGACGTGACCAGCACAAGGGCCCAGCACTGGGGTACAGATACTCGGGGACCGAGTCGAGGGACCCAGTACGGGCACGCGGTATCCGGGACCAGTACGGGGGAGCATCCGGTGGACCGAGGTACGGGAGACCGTAGTACGGGACCATTACGCAGCAGTAACCGGGACCGAGTAGGGAAACGAGGTACGGACCGGGAGGGACGGACCGAGTCGGGACGAGTACGAGAATCGGAGGACGGGACGAAGTAAGGGAGGTACGGGACAGGTACGTGGGCTCTGTAGGGACATGGACGGCTGGACCAGCTACGGGGCACTAGTTCGGGACCAAGTACGGGGGACTCCCAGTACGGGCGACCAAGTTGGGGGACAAGTACAGGACCACCAGTAACGGAACCAATAAGTAGGGACCAGTATGCTGAACCAAGACAGGGACCGAGGCTACGGAAACCCGGAAGTACGGCGGAATCAAGTACGGCACCATGTACGAAAATCGAGTACGTGGACCGAGTTACGAGCGGGACCATTACGGGAACCAGGTACGGCGGACCAGTTCGTGGACCGTGTGACGGAACCCCAGTACGGAGGGAATCCGTAAGTACGGAAACCGGAGTACGGAACCGGAGTACTGAACCGAGGTACGACCGGAGTAGGGACACCCCAGTACCGGGGACCCAACGAATATTGAGAGGTACGGGCCGAACGTAAGGGGACCATTACGGGGGGGACAAGGTACGATGGACCCAGTCTCGTTGAAGAGCCGAAATAACGGAACACCAGTTTTTCCGGGGACGAGTACGGGAAACCGAGTAAGGAACCGAAAGTTACGGAACCGAGCACGGAACCGAGAACGGAACCCGAGGTACGGGGACCAGTACGGCGACCGTACGTGGACCAGCCACAGGGCCAGCACGAGTAACCGTACAGGGACCAAGTAACGGGGACTCAGGTTACGGGGGCCAGTACGCGCGACCAAAGTACGTGACCCAGTAGGGACCCAGGTACCGGGACCAAGTAACGGACCGAGACGGGACCAGAACGGAAGCGGGAGTAAGGAAAGCCGAAGTCACGGACCATTTAACTACGGACCAGGTCGGGACGAAGTACGGAACCCGAGTACGGTCACCAGTTACGGGACCCTGAACGGGGATCGAATACGGGAGAAGACAGGAGGACCCATTACGGAAATCGAAGACGGACCAGGGTAGGGAATCCAGAGAGCGGAACCCCGGACGAGACCAGGTACGGGACCAGTATGCGACCCGACGGCACCAGTACGGACCAGTACTGGGTACCCAGTACGCGGGGACCAAGTCTGGACCAGTTAAACGGGGACCAGTTAGCTGGAACCCAAGACGGGGACCACGTAGGGGTCCATGAAGGCCACCAGTACGGGAACCAGTAGGGGAACCTGTAGGGGACCAGTTACAGGGACCAGTTACCCGGAACCATTAGCGGGGAACACAGTACGGGAACCCCCGATTACGAAACCAAGTACGGGAAGATAGGAGGACCAGACGGCACACAGTACCGTGACCGAAGTACGTGACGAAGTACCGGGACCGAGTTTACGGAACGTAGTACATAGGACACCGAGGGTTACGGGAATCC
GAGTACGGAAACGCGACGCTACGAACCCAGCAACGGAAAACACGAGTACGGGACCGGGAGTACCGAAAGTACGGGGCCCCAGTACGGGGACCCGAGTACCGGGAACCCGACATACGGAACCCAGTCGAATCCGAGTTCGGAAACCGACGTAGGGACCAGTAAGGACCAGTACGGGATAGGAAGTACCTGGGACCGTTGTACCGAGTTCAGGACAGTACGGGACCAGATCGGGACCGAAATACGGGAACCGATGACGGGACCATTAAGGAACCGCAGTAAGCGACCGAGTGACGGATACCAGTCGGAATCCAGTACAGGGGACCCATTTACCGCGAAAATCCGAAATAGGGGACCCCAGTACGGGACCCGTACGGTATGAGTACGGGCCAGTACGGGAAACAGATAGGGGGACCGAGTACGAGGACCGTAGGGGTGACTGGACGCACCCAGTAAGGGGAAACCAGTACGGGACCAGTACGGGACCACGCGAGATCCAGTTAACGGGACAAAGTTAACGGGCGACCCATGTAGGGACAGTGTGGCCCGATTGTAACGGAACCGATGTACGGGGAACCTTTACGGACCCAGTACCGGGACCCGAGTACGGAAACCGAGTTACGTGAACCGAGGTAAGCGGATCCCGGAGTCCGGACCGGACGTATACGCAACCCAGACTGGACCGATTGACGGAATACAGGTACGTAGACGAGTTACGATGTTATCAGGGACCGCCAGTAGGGACCAGTACGGGGAACCTCGAGGTAACGACCACCGGTACGGAACCCCCAGTAACGGCGAATTCGAGGGACGAACCAGGACGGACCAAGTAGGGACCCAGTACCGGGAAATTCGAGTTACGGAACCAGTACGGAACTGCGAGTAGACGGACCCGAGTACGGGAACCCAGACCCGGGAACATAGGTCGAACGAGTCTACGGAACCAAGTACTGAATCGAACGTTTACCGGGGACACAGGACGAGTACGACGCGCCAAGTACCGGGACGATGTAACGGAACCGAGTACGAACTGCAATACTGTGAACCGAGTGCGAAACCCGAATGACGGAACCAGGTAGGGGGACCCAGTAACGGAATCGGAGTAACGGGACGTGTTCGAGGACGGGACGAGGTAACGGGGACCAGTACGGGATGAGACAGCAGGAACCCCGAGTCACGGAACCCAGTAACCGGGAATCGGTAGGGCAACCGGTGGTACGACTAGGAGACCAGTACGGGACTCAATACGCGAGCCGGATACGAGACTAGTACACGTGACCAGGCACGCGGAACCAGTCGGAATCCAAGTTGCACGTCCACCTCCATGACAGGGACCAAAGTCGGGACCCCAGTACGGGAACCGAGTAGGAAACCGAGTTACAGGGAAACCCAGTACGGGACCACTACGGATTCACAATAGGAGACCGAGTTACCGGAGTACGGGACCGAGTACGGGACCAGTGACCGGGACATGCGGGACGGGAGACCCTAGTACGGAAACCCAGTACGGAAAAATCCGGCTCGCGGGACCGGGATACGAGTACGGGAACAGGTACGGGACCAGTGATAACGGGTCCAGTACCCGGCCCAATAGCACCGAGGAGTGAAGCCAAAGACCGGGACCAAGTCGGCACCGCAGTACGGGACCAGCACGGTACAGGTAACGGGGACCAGTACCGGAGATCGGAACCGAGTAACGGACCGATTTACGGAACCCCAGTTACGGGCCGAGTACGGGAATCCGAGTAACGGAACCCGAAGTACGGACACCACGTACGGGGACCCGTCCGTGAATTGAGTTAACGAGGACCCGGAGTAACGGGACTCAGTACGGGACCCGAGTACGGGAACCGAAGGTTACGGAACCCAGTTACCAGAATCAGTACGGGACCGAGGATTCGAGCACGGGACCAAGTACGGGACCAAGTACGGGACCAGTAGGGACCCGTATACGCACCAGCAGGGGACCAGGTTAGCGACCAGTATCCGGGGACCATACGGGG
ACCGAGTCACGAGAACTGAGGTTACGAAGCCTGGTACGGGGACACAGTTACGGGGACCCATTACGGGACCAGTACCGGGGACGAAGTTACGAATTGAGTTAAACCGGAGAACCCGGGAGTAACAGGACCGAAGTACCGGGCACCAAATTAACGGGCCAAAAAGCGGTGACCAGGAACGGTGACGCCCCCAGTACGAACCACTACGGGACTAGTACGAGTAGCGACCGTACGGGACCCAGTAGGGACCGAGTCTGGAACCGAGTAGGGAACTCGACGTACGGGAACACGAGTACCGGAGCAGTACCGTGGAACCCAGTACGGAAACCCTCATTGTTACGGGGAATCGAGTCCGGGGACCGAGTACGGGCACCAATTAACGGGGACAGGGTACAGGACCGAGTACGGAACCGAGTACGAAAAAGTAGAGGCGTACGGAACCCAGGTAACGGGGGATACCCAGTACGGGCTCCGTAAGGGGACCAGTTACGGGAAACGGAGATACTGGACCGGAGTAAAGCACAGCGGTTGGCGGACCTAGGTTAGGGAACCGAGTTCGAACCGAGGGATACGGAAAACGAGTACGGAGATCAGATGTCCGGCGGGTTCCGATTACCGACCCGAAGCAACGGAGACCGAGTTCGGAACCGGAGTCCGAACCGAGTACCGGGAACTCGAGTACAGGACCCGAGACGGAAACGAGTAGGCAATAAATACCGGGAACCCAAGTTACGGGTTCCCATACCGGGCCCCATGACGGGGAATCCCAGACGGGACCACGTCTACGGAACCCCATACGGGGGACCATAACGGGGAACGGACGGGACCAGTAGGGAACAGCACGGAACTCGATACGGAACCCAGGTACGGAACCCACTGTAACCGCGAACCCCGGATTAACACCGAGTACGGGGACCACGTAAACGGGACCAAGTACGCGAACCCAGTTCACGGACCAGAGTACGTGAACCGAGTACGCAACCCTACTACGGGAAAATCGAGTTACGGCGAACCGAGTACGGTGAGGACGGGGAACCAAGTACGGGACCAAGTACTAGAACCAAGGGTTACGGGACGAGTACGGAAACCGAGTACGGAACCCCACAGTACGGAAACCGAGTGACGGAACCGAGTCGCAACAAACGTACGGTGGGGACCTCAAAGTACGGTGGGTCCCAGGGGTAAGGGAACCAGTACGGGACTAGAGACGTGACGCGAAGTACGGGACCAGTACGGGTACCCGGGTACGGGAACGATATCGATGAACAAGATACGGGACCGAGTACGGGGAACCGTTGTACGGACCAGTCCGGAACCAGAACGGAAAACGATGTACGGCAAACGAGTATCGGGAACCCGATTACGGAACCCTGATAGCGCTAAACAGAGTGCGTGATCCAGCTACGGGCACCAGGTACGGGGACCTAGGTACGGGTACCGAGTACAGTGGCCCAGTGTACGGGGACCAGGTACGGTGACCAGTACGGGGACCCGAGTCCACGGAATAACCATGTACGGGGACCTTACGGACAACCGTATGGGGACCTGAGACAGGAACTGGTACTGGGACCCTCCATGTACGGAACACTACCGGAACCCACGTTCGGAATACGGAGTACCGGACCCGATTACGCGTTCGGGACCCAGTTACGGGCCGTAAGGTGACCAGTAACCGGGACCAGTTACGGGACAGTACGCGACTCAAGGTTATGGAGGATCCAGGTAATCGGGGATCCAGTTAGGGGACCAGTAGGGGACTCAGTTACGGACCAGTACGGGAACCCACCAGTTACCGGGACCAGTACGTAGGGCCAGTACGGGAACCCAATTTTACGGGGTACAAGTAGGTGAACAGACCGGGGCGCAAGACGGGGGAGCCCAGTACGGGGAGCCAGCTAGAGACGGACGACGCGGACCAGTACTCGGGGCAGGTAACAGGGACCGAGTAGTACGGAACTCCGAGGTAAGGGGACCCATTTACGGACCAGTACGGGGGACCGGAT
GTACGAACATCCCGGAGGTACGGAACCGCAGAAGGGATCTATTACGGAACCAGTTTTACCGGGGACCGAGTACGGACACACGATACGGAACCGAGTCATGGAACCGTGTACGGAACCCAGTACGGGAACCCCATTACGGTATCGACTACCGAACCCAGTATACGGAATCATGGTTATAGGGGGACCCTGATGGAACGATTACGGGACTAGCAGCCGGACCAGAGTAAAAGCGCCAACGCAAGTAGGGGGGACCCAGTCACGGGACCAGTCAATTGTGAACCGATGTACCGGGACAATGACGGGGACCAGTACGAGCACCCGAAAGTACAGAATCGATTTACGGGACTAGTACAGTTTTGAACCAGCTACGTGGACCGAGTACGGGGTAACCAAGTACGGCCGAGTAACGGAACCGAGCTACGGACAGACCGAAGTACGGAGACCGGAGTATGGGGGACCATTACGGGAACAGTATCGGGAACCAGTTCACTAACCGAAGTCGAAACCCGGGTAACCAGGAACCAAGGTACGGAACCCATTAGGAATGGAGACGGAACCCAGTACGGAATGGAGTAACCGGACCCGAGAACGGAGTACGGGGGACAAGTGCGGGGATCCAGGACGGGACAGTTTACGGACACAGAACCGGTGACCTGAGTAGGCCAAGCCCGTAAACGGGGACCGAGTTACGCTGGGGAATCGAGCACTAGAAACCTTAGTGAATCCGCTATAAACAGAGTACGGGACCCAAGTACGGACCAGTTACGTGGAACGATGTTACGGGGACCCAGTACGGGACGTGACGTGGAGCGAGTAGTGTTGACCCGATACCGGAAACCAGTAACGGGATCCGGGAGTACGGGCAACGGAGTACGGAACCAAGTACGTGGGACCAGGGGTTACGGGCAGGTTAGCGGACCCCAGTTACGGGCACCCCAGGAAGGAATTCGCGTACGGGGACCGAAGTACGGGGACCAGTACGGGACCCTGAGTACGGGGAACCGAGGTACGGGGAACAGTACGGGACACCAGTACAGGTGACGAGTACGAAGGGGACCCAGTACGGTGACGAGCCAGGGCCAAAGACGTATACCCAGGTTTGGGAAACCAGTTCGCGGACAAGGTACGGGAAGCCCATGGTACGGGGACCAGTACGTGACATGTACGGAACCAAGAGTGTAGGAGGACCGAGTAACGGGAGCCGCCAAGTACGGAACCGAGGGTACGGGTACCGAAGTACGGGGATCAAGTACGGGGGTCCAGTACGGGACCACCGTACGGACCAGGTACCCAGGGGAACCATTAACGGGGAAGCAGTTACGTGACCAAGGTACGTGAACAGTTACGGGGACCCAGTACGGAACCAGTTACGGGTGACCAGTACGGGGACCAGTACGGGACAGTTACGGGGAACCTGGTACGGGAACCGAGTTACGGCGACATTACGGAAACCAGTACGAGGAACCAGCGTAACGGCGGCCATGGTAAGGGACCCAGTACGGGGACAAGTTAGGGACCAGTACGGCCCAGGTACGGGATCCGAGTTAACGTAAAGAGTATTTACCAGAACCGGGTACCGGAACTCGAGTACGCAACAGACGTAAGAGGACCATTATACGAGTGACCCGTACGAGGACACCCGCAGCCGTGACCCGGCTCGAACAGTGTACCGGGACGAAGTAAACGGAACCGTATTAACGGAAACCAAGTACAGGATACCAGGTTGGGACCCCAGGTAGGGGGAAAATCGAGTAGGCGGACCGATACTGACATACGGAACGACGTAGGGACCAGCTAAGGGGAGCAGTACGGGACCAGTACGGGACCGATACGGGACCAGTCCGGACCAGTTCCCGTGCCCTGACGTGACGGTAGACTCAGTAACGGAACCGAGTATCGGAAACCGTTTACGGACTGTACGTGACCAGTACGGGACTCAGTAGGAGGACCAGTACGAAGACCCGAGTACGGAACGATTACGGGGACTATCAGTACGGCAC
CGAGGTAGCGAAACCCAGGTACGGACCGAGGACGGAGACCGAGTACCGGAACCGAAGTAACCGGGAACCAGTACCCGGAACCCAGTACGGGAATTCAGGTACGTGAACGAGAATACGGCACCAGGGTACGGGCCAGTACGGGGCCGAAGTACGGGAACCGCGTACGAGAACCGGTACGGGGAAAGACAGAAAGTTAACCGCGAGACCACTACCGGGTGGACTCAGTCAACCGGGACGGACCCGTAGCGGGACCGCACCAGGGCCAGCACGGATCACCGTCCAGTACACGGACCCGGATTTATCGGGACCTAAGTAATCGTGAACCCCATGTTTACGGCGACGAGTACACGGAACCAGTACGAGGACACGAGTACAGGAACCAGTACGGGAACCGCACGGTCCGGACCAATACGGGAGACCCCAGTACGAAACTCTGGACGCTCGGGGAACGGAGTAGGGAACCGCATATGGAAACCCGTAGTAACGGGACAGGTACGGGGAAGCGGCGGTACCGGGTTACGGGACCAGGACGGGACCAAGCAGTGCCAGCACAAGGGGGGCCAGCCACGTGTATCCACGTTAGGGACCAGTACCCGGGACCGTACCGGGTGACCCAGTACGGACAGTTCGGGGACCGCAGTACGGAATCGGGAGTACGGGAGCCGAGTACCGGGACCAGTAACGGGGACCGATGTGCCGGAAACCCGAGTCACGGAACCAATGACGGGGCCCCAAGTTTAACCAGAAATCCAGTAATCCGGGACACGAAGTACGGGACCATTACCGGGGGTACCAGTACGGGACCCTAGTACGTGACCGAAGTCGGAAACCAGTACGGGACCGATGTACGCGGAACCCGAGGAAACGAAACCGAAAGTACGGGACAGACGGGAAACCGCTACCGGGACAGGTAGGACTCAGTACGAAAATCCGAGTACGGGGAGACCCCAGTATCGGGTACCATTACGGGACCAGTAACAGAACGGAGTAAGGAAACCACAACGTACGGAACAGTTACGGATATCGGAGTCACGACGGACACGAGTAAGGCCCACCAGAGCACCGGGACATCAAAAGTAACGGGACCAGTCACGAGGAAACCCGAGTTTACGGAAACCCAGTACGGGGAACCAGAGTACGGGACCAGCTCACGGCGGGACAACGTCACAGGGACCAGTACGTGACAGTTACGCGGGGGAGACCACTACTCGGGTACCATGTACGGGACCGATTAGTGACCAGCTACGGGACCAGTACGGAACCAGGTACGGGCACCCAGTTACGAATTCTTCGAGTTCGGGGGACCAAAGTAACCGAGACGGGAACGAGTTACGGGGACAACCAGATAAACGGGACCAGTAGTGGACCCACTACGGGGGACCAAGTACGGGAGCAGTACCGGGACCAGTACGTCGAACCGGAGTACGGGACCAGTACGGCCACCGAGTACGGAAACCGACTTACGGGACTGAGACGGATCAGTAGGGGAGCCCAAGTAAGGGACCCAGTAGCCGGACCGAGTATGGAAGTACGCAGGAACCGGACCGGTACTCGGCACCCGAGGTACGGAACCGAGTACGGGTGACCTGGAGGTCCGGTACCCGCAGTCGGGGCACCGAGTACCGGAACCGGTACGGGACCAGGTAGGGGACCAGTAGGGACACAGTACCCGGGACCAGTACGGAACCCAGTAACGGGACCCAGTACGGACCGAAGTAACGGCAGACCGAGTACGGAACCGAGTTACGGGGACGCCCCTAGTGCGGAAAATTCTGGAGTCGGGACGCATGTTAAAGGACCGTTTACGGAACCCCCAGTTCGTGACGGGATCGGGAACCCATACGAGGGACCCGAGTAACACGGGAACCGGAAGTACGGGAACCGTGAGTACGCGAAACCAGTAAACGGACCACCGGGAGGGAACCAAGTTAGGAACCGATACAAGCGGTGACCCGAGTATGAGTCAGCGGGACCCAGGTCCGGGGACGGATACGAACCCG
AGTAACGGACCCCATACGGGATCGAAGTACGGAACCGGGAAGTAACGTGAACCCATACCGGGCCCCAGTACGGAATCGAGCCTTTCACGTGGAGGGACCCTCGGGTACGCGTAACCGGGAACCAGATTACGGGGAACCGTACCGGGACCGGAATAACCGCACCTGAGTACGGGACATATACGGTAACCATGTTACCAGGACCGAGTTACGGGACAACCAGTACGCAACCAGGGTACGGGGACCCAATTCAGGAAACCTCGCATACGGGGGAGCCCATACAGGGTAACCAGTACGGAGATCCAGGTAACAGGTGACCCCAGTAAGCGGTGGAACCGCAGTAAGGGGACACCAGTACGGGACAGTAAGGGGGACCAATACGGTCCACAGTACGGGGACCAGCACCGGGGAGACCGTCCAGGTACCAGTTACCGGGACACAGTACGGTCAAAAGTTACGGGGAGCCAAGTGTCCGGGACCCAGATGGCCCCGAGTTACAGAAACCGGAGTCCGGACCATTACGGAAACCAGTACCGCGTACCGAGAGTACTGGGAACCCGAGTGACGGAACGGAGTGCCGAACCCGAGTACGGAACCGGGGAAAGTACGGAACACAGGTACGGCGAACCCATATCGGAAATCAAAGTAAACGGGAGACCGAGTTACGAAGTAACCGGACCGTATAGTCGGGACCAGTTTGACCGGGGACAGAGTACGGACCGGAGTACGGAAGCCAGATAGGGGAATGGAGTAACGGAACCGGAGTACGCGGAACCCATAGTACGGTCGAACCCAGGTACGGAATCCGAGAGTAGCGGGACCAGTACGGATATGTAGTAATGGGACCGAGTAGGGGAAGCCAGTTAACGGGATCACAGTCGGATCCGTACGTTACGGAAACCCAAGTAACGGAATATCGAGTACGGGGGGACCCGAGTTACGAAAGTAACGGGGGACCAGTACGGCAGAGAGTAACGGAACCGAGATACGGGACCAGCCTTACAGGTGATATCGAAGATACGGGAACCGGAAGTACGGACCAGTACAGGGCCCAGTAGGAATCGAGTACGCGATCCGGTGTTTACAGGAATACCCAGGACGAGTAGCCCGGGAACCAAGTGCGGGACAGTTTACGGAAAACGAGGTACGGGAAACCCAGTACGGGAACCGGCATTAGGGACCGGTACCGAGACGGGAAGCCGTTAAAACGGGGAACCACAGACAGAACCGATTACGGATGTGGAGTAACGTCACAGTTAACGGGAACCACGTACGGGGACCGTAGCGAACCGAGTACGGATCCCGTACCCAGGGGGACCCGTACGAAACCAGTACAGGAGATCCGAGTACGGAACCTAGTACGGGGCCCAACGTATGGAATCGAGTTACGGGACCGAATATACCGAGTTTACGGGAACCGTGTACGAGAAACTAGGAACGGGACAAGAGATACGGACGAGATACGGAACCCCAAAGTACCGGAATCGAGGTACGGGGACCCGGTACGAGTACGGATCCAGTTTACCGGGACCCCAGTAGGGGACTCAGTCAGGGACCAGGTCGGGGACTCGAAGTACGGGAGCCAGTTCGGGGACCTGACGGGACCACGCACGAGTACCCAGTACGGGACCAGAGTACCTGGAGTACCTCGGGAACCGAGTACGGAGGATTTACCGGGACCAGGTCACGGGACCGACGGTAACGTGGAACACGAGTTAGGAACCGAGTCGTAATCCAGTTACAGGGACCCTGTACGAATTGGATGGGTACCGAGAACCGGAGTCCGGACCCCAGTTACGGGGAACCCGAGTACACGGACCTGAGTTACGGATGATCCAGTACGGGAATCGAAGTACGGGACCGACTAGTACTAGTACGGGATCCAAGGTACGGGGACCCAAGGCATGTGGGGACCAGTAGGGGACAGGTAACCGCGGCCTACTAACGTACGGGACCATGAACAGAGGACTAGTAACGGGACCAGTACGGGAACGCGGGTACCA
GACCGAGGTACGTGGACACCGACTACGGGACGAAGTACGGTTACGGACAGTCGGGAGCTATAGGGGATTACTAGTACGGGGACTCTGTAGGGGGGACCCACTCGGGGCACCAGTAGGGGACCAGTACGGCACCAGACGGGACCGCAAGTAGGGGCCAGTAGCGGGGGAGCCAGTGAACAGGGACCGAGTACCGGAAACCGATGCTGGACTAGGTAGTGCCCAGTACGGGACCAGTACGGGACCCAGTAACGGCTACAGTACGTGACCGGGGATGTTACGGGACCAGTCGGGGACCAGTACCGGGACCGAGTACGGGAACCGGAGTACGGGGCACCATTTAACGGCCACCAGTAGGGGGTACCGAGTCGGAAACCGGCACGGACCGTGTACCGGGGACGGATTCATGACGGAACCAGTACGGGGACCGAGGTGACCGGAACGAGTAGCACCCGAGGGTACGGAACGAAAGTACGGGGACCCAGTTACGGCGTGACCCCATTGACGGAGGGAGTACGGATAACCCAACTACTGGAATCGAATACGGACCGGAGGACGAGTACGGGGAACCAGAACGGGACCAGTACCGAACCCAGTAGGGACCAGTACGGGGGATTTCATGATGTGACCGAAGTCACGGGCCGATGTACCTGACAGTACGGCAACCGAATCGAAAACCGATACGGGACCTATACGTGACCTAGTACCCGTTGTCGGATTACGCGAACCCAGTGACGGGCCGAGATGGGAACCAAGTACGGAGGAACCGATCGGACCGATAGGTACGGATCATTACGAACCAAGTTACGGGGACGGTCGGCAACCAGTGCTTTACGGAAAACCGAGTACGTGAACCCAGTAACGGGCACCCATTAACCCGGGAAATAGAGTACGAAAGCCCAGGTACGGGATCGATGTAGACGAAGGCCCGATGGAACGAAGTACGGGACCAGTGCGTGCGACACAGTACGGGACCATGTACCAGGTCCAGTACGTGACGAAGTACGGCAACACAGTACGGGACACGGAAGTACGAACCCGAGTTCTGATTCCGGTACGCAAACTAGAGCACGGGGACCCAGTACGGGACCAGCTAAAGGGACCAGTACCGGGGACCAGTTACAGGACCCAGTTTACCAGTGACCGAGTAAACGCGGAACGCCCAGTCTGGACCGGAGTACAGAACCGGAGTACCGAAGCAGACGGCTATGGGACCAGACGGGACCGAGTAACGGGGACAGTACACGGGAATAGTAACGGACCGGAGTGCGGGACCAGTAACGAGACAGACAAGAGGGTCCAGACGGCACCAGTACTGGGAACCAGTTACGGGGGTACCAGTACCGGGAACCAGTATGGCGACCAGTTACGGGGCCAAGAAGGGACCAAGTATCGGTACCAGTCTACCCGGACCCCCCACTTACGGGGACCATACGGGCTCGAGTTACGGAAGGGAGTACCGAGATCCATTACGGAAACCGTAGACGGGGACCAGGTGCTGCAAGACAAGTAGGGGGGACGAGTACCCGGGACCAGCTTCGGGACCAGTACGAGGGACCATGTACGGGCCAGGTTACGGGGGCACAGTGTACGTGACCGAGACGTGAACCGAGTCGAGACGACCGAGGTACGGAAAACCGTGTTAGGCAACAGAGTACGGGAGCTGAGGATTTACCGGAATCAGTAGGGACCAGTACGGGATCCTAGTTACGGGGGACCCAACGTACGGGCACCAGAGGCGCGGACAGGTACCGGGGACCAGATACGGTACCTAGCGTACTGGGACCCAGTACGAGACCCGAGCTACCGGGCACCGGATTCACGGACCAGCTAGCCGCGCACCCGAGACGGGGGAACCCAGTACGGAACTCAGAGATAACGGAACCCGAGTTGAACGGAACCGAGTACGAGCGAAGCCTAGTAACCCGGGACCATGTACGGAATACGGAGGTACGTGGACCGAGTAGCTGAGTTACGGCGCTCAGAGACGGGACCAGTACGGG
ACCCACGTAAACGGTGGGACCAGTACGGACCAGTAGCGGGGATCCCAGTACCGCGACCAAGCACAGGGGCCTAGCACGGGGGTAACCATGTTGACGGGACAGTTAAAACGGCACCAAGTACGGGACCAGTACGGAACCCCCAAGTACGTGACCCAGGTACGGACCAGTACGGGACCGAGTACGAAACCCGAGTTACGGGAACCGACTACGGAGACGAGTATGTCGGGACCGTACGGCGAACCAGTACGAGACCGACCGGACCAGTACGGACCCAGTACCGGGACCCATATACGAGCCGACACGTA
+S4_13395 id=0.681210 len=18305
%#'&%#-*$%-)(.'*"-$(&"&(&,!-&)/),'#'.&(#.'$'+.$$-++%$'%-&)$+)$+,!-%.%,')*!#'!**.(+)"%$!!$$+-+)*+-&#(%-#%.),%&,%%!%&%(&&+.("+'+-*$&%,*+$)'&!,()'(-,%+()(((.)"+-+%)&$&(*&!"&$)&*(),,'+,&%&&%+())-%&").+"&$+-$$/+$$!*!-'*%-!#&."*%,+.#-&('$,"*,,."$/"#'+&*#-'#&$+$.*&$#%*)'+&%.*+#*$)#%&$,%+%".#$%,.'%)(&#)*+.+-#''#!+(%.$&%%%',.)$'&$%*%!+'.)&''-('"'#"*,&%(,(',*%$%)-))*#.)-+#+&!,$'--&)(.'"')$,,''#''#.,,,)!)&&((-++%'&)%+#.&%#,#%*.'*,)(,!$$&##%.-)&)$*+*!#.$&'+"')+%%*'%.**',!&"!$*($')$..+%')&-$'(,$"#(#,!%$+&(%'((%$#$$.-!-)-'#,&*%')'*.+%*(-&!,.)"#,('#.**$$$").&'.$$."-")$'!"+)'--&*!&'+(#!"*&+$%",)'+"".('"%#,)&%*&.('(*,.'$+!-$$().."%((*$!($&$%"#--#)*"'.$,$&"#"&#$!!"*(#),,+'$.'#''-($$+&'$'-)$",$"..*&.,$+*,#(!(.%'(,(!)+,)'"**)#!-$&-&"!#-#*%!".'/,&&'$$#"($#$%#''-&%.&'$-#--(,*(*(-$+'*$&-).(#'&"-*&)#.(),,(+-#,''"(-##&&+&.-$&%##!#).*%,--)$'+*-,)$%+-*-)"#-''&.'((,)%*-$)#!'$%),-!!,#-,$'"!-'.+&()&,)*&%-+#'#,%!$*$*)!,"#,&.,'**''-#$*,"'!(),$)&.,#-$%&*$#)&)*''$&(&'',!'*))%)"%+'(&,."&'",&+),!''&--(-*%-*)-#%"*-'($(%*)#''#%,),(-#$,$,#.'&##',"/**-"+(($&#($*+$),&,&$&&"+'$%)*%+&($'##*-%*'&'*")"%"))++,)!#)*,''&/..#",&#*+-'$)-+".,$*)-"#(#.&/+&.*,&-)("+''.'+#-".##(!%*.+"%+,"#---$-'%)%#.,%#"&&$(!)$#(%($#&)))-(-+&''*)+-.-.%%($*$'$#++*%(%-$(+'&#.$$!!#%%#!)"$*$"-%$"&'(**#),!"!&($&.*)$$)$$%.'%$$(%(%-)&&%%&*%+%,')('-+('&,,,("&&*%#.!(&*+#'&%#)++()"&$.*&+.$$*)-"+$"))+,#(%&*%$,,,.(&$&+")"(&",&$#,-*.*$'&)()$&,&*',!#/&,".(',&*%(.&'"(&&$-&'*%-'*#,-#(#'-*!#&"'.!,'"'$,*#%(--'*.--$"'&('&--+.&#&$.')$%*&-$)#"*'.)+&'#%%,$%!(!$-$#*$&&%#*.*,!,&&,#"&(&"'()'#)*'$$$'((,*%++'$)--'&.$%%$+%-)+-*$$-&",&-()"&#)&&',#,"'$&)')%)-%-#$*$'"'!"-+)(#!&"#$'.+$)),-%+(($#$+))%)*'+*".+,&%",%$"-,%##&***-%,-.#(")*+"*-,&(%%+(*"&"(-'(.',$&*(,))+&+,&)",$&#&%-+*%).+)-)'(&(#$$*!-,$%'%#"")*&-,&$(*&#)&.(+'*.!+%-(-$'$+')&(!*'+.&(*,("+#$%$'%.+((*,!%.,-+'%'%'+)%'(--*&'&)."(#)($,(',+*()&".,$.##,&&(,,%',%'...#$-.!*".*'&))*!#+,($,'&#%$++(##"%!"**#)&.'&(#+*%)%,,"$%+'''&))'%!(!""!,"'%)."*%#.#%&*)$$'"!#"(&"'+#(+!.))-)#)%$.&-&#-(*&!+(")%##"%"$)"$&.,'.#"-,!%&.%'&%
$%*$-,+&',-$$#((!&('''*,##+&(("##+&'!$+(&*")-)$%&+##),!+%$$-")&)&$"&&-%#.&+$%%'*'!,#$$,(&",'*$#&&($#''(&$&)("()(-,)#"("'.-&)','$&)&$'.."*))*())''%.(,"&+-#+&%(!*(+%'"%(+%$,,+'*)#)#&*-'!!%+&$+&,(*%)&,+)&!(#((#"$%&%$-%((#%#-".,+)'#')$+)*%()'(-"&$)%-($!+%%,*#/%!'+)*.%,),,&,+,$%',%&!)#*.&&'**,(,')##,#&*-$+,*#)(&!-"()&%",.#)-&')'!+"&'.-%+),'($,')&#&!($&"%%&#%%'")-.)#),#')""'(#$#'%!#)"&%,-(''!#)%#$..%+)&-'&&"'-&'!%.+)+"()((+$-%+%%'"%$%#)*$'--!.$%#$'$,"++$%"''&#-++#'-+$"%'%)'-%.$*$&#%$&!)$'!&&##-)!*$'&*#&'"*')'!!(&(%*$((($%*&%**&-(#(*'##$-+!.,.)+)&.#.,&)-#.)$%(+'#-%)",(,$("$$,,.#&%!#.-)%(,)'!&)%")%.$($#(+)+.)-)%%+%(+(')(.!%.%')-+$'%)"#%$"%'*)*,%))$#(,'((&$%&#'.)+)"-%%%)+&%,%.'%$/$&'$-$&).("-(--!#$%,-&$()-,'(#'+(%.,'&+'&$+-#*-)%(''%"".&.'-&%)%()#'&%%#("&"#**&%(.!.&%%*#$#'-'&-.,)%#+,(#$+,-('+''+$$#.%*'#.-',$*,,,)!,+%(!('(&/&'$(!(%+"&"%-.(!+*$++"-"!--'+),(##$(!')"%+&(*%(,!,!#--(-$'&&#(.**'$"%-#))-#*(+.-(.#,%!#,(,%"'(*("%)'$#&%"-")*&-)*",&".$.,,!%&",,&,%!."%""*""'!*#&-',-#,"-$&*%(("&%"&&+++,$,%$.+"-)-,+*%"%)+((+-$"&.+'#-%%,..++."'-%(.(*%',#")")",)$%+#)'%.&'*&($*,#)&,-$"#%!(%&&.)*##*(&)&-+++%,)'(&%,(&%%&$($(.+)&%)&"%+%"-"&+"%%&)'&&+)#.(%,-''%+#*$,''%&("')&#&%),*+,$*+--*&+!(%**$(('%'-,'%!.))#*&%-'"+++((,'--(&'#%$)$!'(&'(".%+)()*%+(%&**)!%"-&&"&(#,#,**$!%&),"+(,)+(&($&#$%-.*%($'""("%(&)($"&#$!.+*,%'/.#--,''"%('-.*,&&+!'##*'',(%$&)#(-&+&-$#+)!+&$'#!,('&)%,,-$&)$-$#%%-$("&(&'$).&!$%'($,'""/'*$&%&)!%$-*-'#&*#&&#(*.$%#!&(/$%$.!-."$$.-#*&-'(%$!%$.+('+%"#(**'+*,#%"#((,.'),#$-*'$-"*"&+$%'.#)).+-&$'-&,*"*$)(.#,,((-%%'($'*+'"(''+%*$#,*,"*'&,)$&$()-"("-%$&*#$#%($.*".-#**((&#%)*%,#,%"'''#"")&)-%)!*($&)#%"%(!%-,'*#-(!*+",&+'*#'*$&-)"&-#)'#.*&,,(*&-,(.)%+'"%-+(%!'+&&-&%&*!,-$*)!)+(!'$#-%'$&&-+,*'%.,-&-$(+'*+'&(&&")'%%&(*",'-(!(-$&*#*%.'.#!**&&"'%'+)&,*)%.(*-'.%#'&&+,+-$'-!,",#,(#(&$$#(*,&$##*)'#&*%)%*)*.!(#,#)'-,."&"%$%&%!&#'#$+,',+(,+#&!(!+'$+-!-($(.(-*"*-%'"#*$*(",."'%$%("(.%#$,&#"#$,&,*,!'"+"#)&(&"+"!$).'"'),)#%$-*")-"'.,,'+).$*#%*%%)#*$$($%%&(&+)%,"$".*,-!&)#"&,%&&&--%#$),,.)+$,,,%$('#--+"
.'%),.&#-+%#&#%*$"&&%$*'"$.&'"+&-"&#",%%.)+&#"'*'+#+&$)+&.$+"!(!'%#-('$''!$)'))%,+'(.'**'-,(%+."#'.)%($+-&+*''"%(-$('-).,'-'%$'+)++!*!#.+-*(*#$%"&%(&-,(+#.'%,$&!"**&,!(#*#,-&(&+,%'##!-%&&.$.&(&")+.)*'&,*)(%&)')#.+(-&-)&&.".&#.$'(-%&"!&.)*)'%#'')*$-*-%*!*"())&(,*-%")%"&%("+&-+"#*#&&,'#%&"%"*##.'$)"%&*$'')&$",.$%*##,+,%"-%$"%!)%&%*#-(((-$!#'%,$*.''#*("""'*''"$##&"*%!!,!$&%!)&%'-'$+,)$)*!##%)%.*!)*"'#(%+.-#%&&"-(%-#.)$'++"&##'+$,*%&.'*$)!.&'$%+#'"#)($,.*.###&&$"+''-#'-#,+*$(,!.!)'),'%,.'%%+(&."$,#*#)-%%'+('+-')(%','&),"'*'++!.%*'%)+)#-()"(')",'-&-(("&%,(',,#.&')''#&"#'(('!+..#%-"))&'(#*+%*##*)($&)%,(-*(*)-$.#()#%')'('#)-%&-,(%,$(&#$&#-%*$'%$()!)')&)%&.")+..!$,')",$%'-(%("-$","*%*&!)'.,+%,!$)*%(-(&#'*"-.)!,*#'&',"&,-#$#'*,$.&&-#-+,/*,)*"%,("*+#&)'-#(&-&.$#&($*%&"!%)*%%-$&...+-*)'%#*""%(,++(#$-#,.%'%(*#-"&)/#%+,''--+,*.))#%()'(%!$,**"+#+#)'.+%,*#).$()$"$,***&*%$,!))'+#&$"$,-&!,%),-,-(!.%#%&&-$+($())"-"%*.)#$."&%-%,%)'+%#($*%#%-+-."!%#%%)-%*%(..*%(#&-%.(&%!*(#)-(!$*&,%$#&(!&,*!','*(&)!"/*)#*("*##&.,*"$("(&+.$!&%-,..-.!-'+'(,-*%-'%%"%*)",)%$&)#(.',.'*+)(&,'&#&%+!&"+&#)"+,,+&'+,%#*"-%-)(&&)#*,#'--,&-%("#&'($%$%!)(*(%$!*$#+"%$%)#",&).,!&'(""$&*,#%#*%$-,%&(("$)#"&#&%(####$&(.(")"*.$!,(+&',-'%%-'*(","'%%$%),%$-&('++'('%*',"++-,*%!,,*'&."$)*'"%&$&#+-',&%-'&,*#'".%++"+-)&-(,(-!-&#,,.('$&-&"&-#*(%')!&%+),.$#%(',**#%$&"'$#,#)(*%"*%,&''(&"*#)(.&%#!,-"%#(()(#*,(&(&#!$++&,%%*%"&+"$"'-$''$"..%(*+--#$*%-,)()$)!!#(")-)'&,'&$+%--$,"+,&!*")*%%")+""+-"$)%*&',)&+$$$%')&(!!%%**%'+&%"&-**.#(,#..%$#*)#&$$+,,,-#&*,!*(%+)%-$'&$&**.+*'$%)&',$&%."#%$#)!+-%&&)($+)'-*!'!),,,"%$"***%+&)-)-%&.$)",#.-#'$#)#"*&$#'%,*"(&!()%!-'!!*$&-*#$$,%$-!,&'#($(#((+($#'##,+*$*++%.-)(,)+('$#)+%(*&#--)!"'$+"%(#(,"('%($,"'.*)(%%%+($&.%$$'%,&++$,&"'-%%.#)"+$*'('"#-!,#()+&"-()+-*%$&.((((*#%#*+'$"-"('**$#),++#-&%%&,-%+-'"&($-+*$#%*-"%&"'(,('%%+"'"&%.,*&&%*#!-%).(*%&##-!*&)%$*,)"').#&*,#,,&(-&'.!,####,(+$#,&+-,+$&#-$$#&."%%(%%++'$-*+!%%$!"#,+%$%%+-+'-,$%+,.-)+#,'.$#$'#)(%,%,#-)"*'%(*))(-(#**!(!()("+$.'&$%-#.%$+*"')*&(%(%%-,
%'---*(,#%',$)+(#($('%%,,.+'%!&!,%%+%,),$#)+#+%)*-$($.&&.$-)(*)'/!$*$$,+*-($)$$+#'%*'*#+%'&,#')',$$&'&()%"$!)&#$-'.*(',$%)&'&".$%'!#.(+%!.)*&+-)($+"!.%%))+"*")&."(%)%-))()")("$#$)#$%&#.($".$&%%.(%)'%%&-)&,".-%)&,-,!$*)%"$"&-&-#!*&)'*),(*+,&+%(*')%$).)$!!#*%%*&"*,#&!"',-,)$,**,&%'&-)&%+#-&$),%'&",''*&**)(+)&#%#((,%%.*(*%$-!"%"-+(.#%*#$,$(!'*-#-'*$(%%-'$.'(&+*-+$(+**)-))+*,$.&.$.$'%""(.!-**$'%+)"-%.(-#%#&%(&+-.)#""#/#&"$).'&)-%--$$!,&*'-")$&%*&&++,$,",,-(++%*!,-&)%##)(+,.&-&#*(.%&&')%&&&)'',',(#%!+-#.,-,)!-(!'%+.-"+*''$%)"'$$+(%)*),(,,.*&!)*#.!#**!'%)$&"&-$-$$.-#("%-$"$++&+#(',-!',$-.-+$,&&#)-"+(.-)$*&(,''%***(),'-#)'%&%'"#(,#-#+#$'"#&(+#%'"$*($,()#.-!"'(%('+%.","'$))*$#+'$*"")*(()(!'')"&+&&$(&*$&&%.,&,"'-'#)&").'#'#$-%.)'-&('+%+#'(.(",''-,(($"*"'%.+(-),%#%'''(.!,-"+*()-((#-)$'%((,'$+#*')(-('.&+&',&'+-"-,"')%&(($+%&,#&&"-%&)""(%'"!*'%-'#-,''&*(!#'''-(+#$+(%%,#&$,#(")).,''($".&('$+!%.)#-.'###&%%%&&%&$,$+"%+,#(".-+%(-((**!$'%-+*%'',%*)*%+('*.,)+#$!*""+#%%+$!,,,%*#&*$*,+((#&$%,&)-'**.%."(&.*$)')&$-""".)()+,'('+&''*,,+*(+'"+%+#'!((&''.')#!$.#)'!&),'&*%#,&&-'%%"#*-)!&%$+,#.,.&(((#%%*''#%&)()")*%%++..-,'-*&$)!+*)''#-)&))%&#.&')$()'',%&-*!("$%+)#%-#"(,&%%%*+!%*!#$$'-"$.))'%*''!&."%#+)'#*('.$,#'"(*'#"#"(+,&),#($'*!'('##%'*()#((-"".(#,*%%%#&$+%&#*-)-!&&*-&%+%'$&,+,*,$&'*#+'&#)$!)'&&,#))+&.%)%#+($!*(',*%%(*%*+&#(*&*&!'#."$)$%('%)&,'''+!$'"&,%-.!.%&+,*,"*'*%$"-)%&'#((+")#%$&$)*-.'+)*$*'..+&%+*)%-(+)!+#%"-')"'.-()(+&#"*$'!.$"(')+')()%(-+(.((,((,&&-+"'%''(,'&,$,'+,#*+-%*-$.',+#(#..)%''#)%'((*&%((&%'!&*$+''(*%!!%.)#,+,#*&)&+%%%*)#*&,&#&#"&'"&,"".-$+&$##!%'%($%*%*-),$+%)%&&',''#")(%#")&'.!"%")%.%((&,$")!'''%.-%((&($,'$$',%#$),#$(&+.!,,'*#,)+&$$&'&,")*-%$'%&.'.&%.&#$$$*"+'+!*#,'&-'"+,$)'/)!.!*%+#,.&("#"$(#&&)*%##&&*',*'$"'%*$'((##.%#$$)(.$)$*+!#('%%-(*+,'&*"%,'%(",+*++%'&"(*##('+,&,)%%&!*"-$&+)$)%.+)+#)&,$,+*!)',(+#.#$/&+*$'%$+).+*&+#%.&"!&"'())*.$%%(#!#()&*(),%#),&.$,%(-$#-,$,&,#%.('-#$.+-,'.+%+#'.'&$),#,&&)%(.)',-."-#%*(),-&!&*'#%'&'*+$%&+%!$#-((#)+-%'+$$#-%('(&+&#((,%&#*-).'(##*&+
''/*)%%(*,!-%(',&#((+,(+&"--#%*)#"*+*/&&,')'.'&&(.%&,$(()+++',&),''-'$**)&$!,)*&-&$#(*"(-,&&)#&-),,'"(##-&'"%(').*#-+#'*!$&"*,-*!##'!).%&'&"",*&-,%"&##&*(++*"'"$.(()(,..$"&&#+*%"+)(-&%+'#,(..#.(*(&--+'%#$'-.#&##%&-.'&'&&'*'+)(*"&),-.%'+('%%-&)%*.+#'''$$%!,,*$%%&')*!+&+")*,)&(('!"!'%(+)'.&*("##"$'+(),(*"('($.((.&$((-&()'#*-)"#'-"&#"##).",-!+-+",&%$$.!+!.*"''%(.'.'.)'",&'))$",.&"$*&%*&"$,-(+-%+*%*$*(-#((&*&&#+-,$)-!"#$.!%-&')*$$+##.*"*+,'-!&"%"&)",$&-(%%.%%''#).&+"#(-)+%&-(+)'$'*)&$-"((&"",""%,''&+"$!''-$"&'.'&&"&&(.$!&"#,&!&$+'%&'##+*)$(("..#*+*"."'&$%+("%&(."&#$$$,,+'-"&.+(",'#%$*"&%')!!)"++%"%$,"*+('#%&'$$)-'&%,!+'!###.%'&.)$"*&-(,%-%,#%#)*&,*(%**,$!')&')!,**+$(&',%)$&)$$'%*#&,&($-()*#.)$)+'$$%',%)(-%++(-%%"*,%.&$*-$)#,+.'!$-$(-+,$(-#,$&$%)'(&")%)%,,&#%+&(.%'+'"*+((&%+).)$&"$"*'%(%'$"$,"(*,&#($*'(&)(%(#")&'*$+)'$-('+*%,-".,%!%$&%#*%%#-.)!$&*".)*&**#-)$,',-$"%'$%#,'&+''!$#'$##.$,#",&.&$$-)'*$#,")%,'*$)$),)#,,"!-%*%)%',.%)$''$'*'-,!("*'",)-%!$(*,'"%&&.%-++$(-.'*!&#"$$%,'--*+.'%&$"+'&*%(%-')*$)#**).%'&%+'"".*'(+)"(*%#.()(%$(+.)&%.%"$&(%(-"&&$!,#"'$""$.*+-'#(%+(.,#&*+%.$!+&&((&##,(*+%'"#.)+*(+-!#)++).+&&#&%('&+&#-)!!#.)-*(*)($)%!)%#&")**((#&-)%(.%*.%!!$*!*%!&))"+"#+#,&*%+)%&(%*)(%*.(+%)"%"*%$,**),-"(($$,(+')'&.'*-%"(&*.'&#%#$&.#+-*!'--)+('(-!%+%**"+)'$$,)&(!-&-(*+-)'"$.)'.-)).)*+%$%-&.%.%$*)#&&&",')##''"##&+,$'',&$,)$'$%#&)-,*")*$+$*$%$('&+-,.%#,%**(.,%."'*##+&-!'')*"##$&&$!,&&'"%,%-,&&-(-*%$"#*#$+$%&.!(!,-*&&.(!%*&*)%.*-$#"-%)*,-&,..$+$-'.%&.(#*(*%(+$+-**(%*&(#)-',"*#-"&%"%&*-'",%*)(**#"+(!-$-*&!+'-,-$-)!!(('+,'&+$'.%$-%''((,(&%-%)'($'-#"*-$$)"'#!"*%&(*&#(.(%$#-*"'(-).)-$,())!)*"%')",")$.-&##&$*%'*%'(,*''%('".*(-+'!+)&*&'$$"-'-(,&()&"'+--$-*##'.'$.,%&*'-#+'"*)+%)$%#%("-,*".(&'$&*'"'&!)''!$&+'"&#$"*$--&&--*+'"'%$.*(.'&.*,+(-'"'(#","'$-*-*$&$(-()''#-'$"(+&&,,..$+!*&%+%&")-.('('%"*#%+*'##%%')()!).%&)",%.+$&*)"+&$&'$-!*()$'$$($$$),$''$)!!%*&'&)$%-#%)!.$((&"+.)+#$)),(.,-"%*'$%##'&#,*%&)++,"+)$")$-$$"'$*$"-($&%.('"--*!"(#,(&%$(#&#(-&%''(%"!!!((''+$+$+-&%)*,($%*,'''%%")&""+)
)$&+&(/('+$''#"#*+%()+-.,.%-!'-!*)+'$#!"-((+".%+"(**&))$)%'"+*"'&,..%%&)-%+-$""%),&.,%('*!-'*.%&#%*'."#,*-#.%().&&%%*,#&$"-$%(!'$*.$$'&-+(%"($&.)"(%$#"''%",-"*##*&*(+.)*$"%,'('((#+!$),("*&#&()'%&&#%!$.#,,.+%,&+!*!!((")%''-#%)')#$!%$'(&&%!)!+)*,)***%"$#&+&!)"#,'+&-%*&!+!"("#+*,'+(-&*%#'*#-+&&.+%%,!)."*-'$)(&'%%'"!#(,,**."".#&)"&%&,&%%)%$"#,&).**---()-!*$$',($#%&'-"'$$,&#$%&&'%%&%%$%&',,)+&###/"(')+&&$.*!)$%%-)('%-%'#.%,#+!#$(.-*%)%)&&%-*(,)(%%')+(!&'#$,)++("'$*#,'''&('((!%,'%!'%&%,!#',-!-+((-'"+%,*$&%$"("&(,"%#-%%*#$#&%*!$(&&-$"&!+)')",(-%-"$(*%#-''((,#")'%%.+%"+*"+-)"$-#+,,'-".+)'!#.*.%..&)&$&'&)#&$&-&$#-'(,%'**))&)%&-#.%)),!%%+-""%)($$"",)#((#-$')'.),))!!%*$$#%'+$).%"$$$",*"'(".'"##*('!$*(&%#$'#'#$"&+#('("")&-!'"*"("))"&.#-+)+-$,%.!-,-%$("$-()..(&.$'%'.)%)&$,%($),!)()(),'(+',&-,-*(&*#*%(-($,*$##,***)-$$*!#'$%&.,+*&+&%".-%)#&+.'(*#,$*((,",$+,&,-,)%(#$&&%($&'".((##%**##(*$+##!&#%-(-)!($(!")%)#'$+$-%*,"-!(&"%,*(&!$'(&&)(.-('+*%%!)!-"%%!"%%-&,%&&+&,'.*'%%%"&!&&!+)&'-''(+&%.+*&-'(&'%%"-.'%.%)%")+,-$&'&##)-%&#*.-(%$)$(',,+'$$'."&$'%-.'(,*!&+%%&)-&#+,%))#,-),."&+*%*$-.%%*")&+(%###"'#*)".%-%+)&'.'!%')(("(!,#*!$$""&#*'#.*&"&#(*)..((,#)&#(-,('&.(+'$$((%%*)'))#()*$,$$(%+,'*%-!,"'')'&-#'-)'(#"($(&&#&"-$-.'#!*$,($%%#'+"&$!-)('(%$+*%"-("*"&'))&-%*"#'''",+"&%)*'++#,!"$+#!(+/.*%*"&(-%!*""#*#$(-&""&"*#$"*%"""&+'#*&&+#,,)(%,#((#)-)*&++$'$%-.'&"$+'%%*'-%,"+&).'(($%*%($($',.-.(!+,%(%$,#%(-$-&+,(#--"#,()*--.'.+)$*.','",&%++'-*!,)&&+.&,-%,*%)&-%**$$#".'#+%#'%%'**(""&%&$&!$+%-'.#!"(.-"*$'(',%('-..%)*-"+#,!")&()+')*(*-&$+$(*#,)"$*+'+-)!%+*.*&+#&'""&'*'")$#'$("%*(".$#)&#("-&(%-)*)#&'#"")()(.-)(")#'$,"$%.(-(,(%,!*)$#*$"&#"%+.*&#(!.)$%(#'#&",$"--)".$'%%(!#'"--)%%%,-$!&*&'(.-'$--#"&'&%,-+.,")#&%.%'&*$+(,"'*%*)',&%%)$(+#&-#,$&,,%((-%'*+%+""&'-#+(+&"('+'))*'+)*&()"!*,%+$*'*))-(&%-!#*,#%..'(!,')$,*&',)$&+%.)++%*++,)-#)-!(-$&-'$.-#+-&'))'*+).('#&'",'%$&$'$'%+'#-)")##%!+")#.'%*%*(#!#((("",,-&#(,+".'##(+%-(-(,)'..),().)'$(-*("')".#$--,+,+***$)*$-%,$%*&$('#*.+#%,)$#'%-)&.*.!'$-"&#&#-."+##+$%''-!(+&-
+#$'#$.%'&#!#%%#&"),,,#)&&)%(&(&'+,)!$%-)&-+"$%+%-)(+#$''-+&%&+##!*'$"&-,&"-.&!%))$&++%)"""%+$'-$(+'-(*+-#!$'!,#!"-+,#+-(,(&"&))-+#$&$.%%%,$**%*,(,"".%.#%%()%$.#)#'-'#$$.&-*".-.$&&$*&%!"-!&#%(+'&"&'%('"'('+,,!$$$%(-%!+)&%!%&,*'$$##"()!"+,%+"'%*'(!'*"%!&(+%%%--*"%%!**+(.-"'(+%###-(&$.."%))(-&(%*..'+-+!.))#-#)!&$&($*!,&)&%'%.'%%%&$#&(+&"*'*",.',#(,&.,*))&-$*(''$!-++%$+$.$"%"$')",.%.-.&-#''"-""$$#,'&"+('#)*-%##('#(+,+.))&.)+#,%!%)&#%$$%-#-")(-+"*!-"!,'.((#&-'*(%$$'+,&%+,!)+##(#&)$$'(+"!$.(%-#(')%%&!'%!#&#%.*..$$&+!,+(,*-#,,(,'-$'*,,,%"')"'(%"&''$)"&%%''#"&*%.("$'$',"'--&(-+"&%)'*%#(,%)(&.'%&**)(-*.,.*&-,,)$''%#(,.-'-),+-('.$%,%)"$&,'%)#,'-#)(&)+$-&")*.')(+,,+-'(#%$++#$*)-%.-"#+,(&--$-'.+)+"#&#+.%("-)"#-&!%-#(,-'*#&$''-)-&&'#&*!+)),+-$#(#)#.-.*!*.,$$''($."(*"%*"('!+).%)'%''&-'**#.()(*)-)*&.&'($,*$+!,(*&&#$-,%!+$!*.))(*!-',+*),.+,#&(-+*#"-,#*"#'&#&%#+$))&&*$*'",%!+'"$(/&.*''(,(%#+#**+"#'),&"""",''&!',!)%''(&,"(%#%')+!%(+*!+#,%*))('!$#,!"%$#$()'"%$&-$$)$%+&.'!)&'*-!-."$$(&*%$#(,('"#&)'$"'"&''%'+%'*&)%!'&)"(#-#.!$&!"."'&,%$#,!,%+(.&.+,(,&##-##',(($)&),*-&(%$,&#$,#&)"$"$*.*-'-'&#'*#)'&!&)($!++,%#(**)%'..*+%(#&%-,!*$&',."$#-"*".'#.*)-"**&%$#.'%.$(*!+""!#.'$&&"-&+-)$)*.$-&$&+'-%$-'%!,+&%"#""!'$-#,($++#$%-.,,)&##*'!,-)-!+%"%-%+*,.(.'")&",$*%*$&$$$'-'')$,%#(&$+&+"$,*)%($!$.$((,#!#!,#(*&"),%&(&',%$'-##.%"($)+*.-#,"*(,+(!#-)-&,$.&#",'+%&&-''-&(*+)))'().'),-''$$.%#+'&#+)*%-'$.)-.%*'%)!%-%+*'&)$'')%.+%!&$'!''-'+)!&)&*-!,)(%)*#'*&+,!,)$*)*$((%(-"$%+'%.#$%$%+)&'')(*%+&.,.-&%+*%"&'("'#%"%.,%%&,.-'(!%-$$+!')*$!#).$'&-(",&&(#$,""''&&)((*&''),.),$#('))%'",()".*(&'-#&,+,%)+$(+&#-!(-)$&#'%#(%,'*)&#"$%$-.&#-#'&",%.(,*/)%#$)%)*.&,&.$.-'%%+','("#-..-.)%&$$*$-!)++.&*$$"(.(+,',.&+-)#--(#$)&')&)+*.'.(,,(&)*$,*-!.(#-,*"&%,,%$'&.,--&+#%',"***)"*($)&%",(-*$($)'!-++"."(##)$$*%%%'%*+&(**(%$#%"'$,%)$&$*,$-#-#&$!*.#',#$&'&(+'&&*)"%").--,&,&,&+.&-'("!!-+-!)%+&.,!&,.*"$'-.%&+)"!')''!%$%&-(#(-%.*#.+$($''-$)()/"*',*+'!#*.)%$!).((*#'%!%%$%$-%))+'*(%""!!#*$#%("%)$+!$'%,+(##(","#+*$-+$!'+(#&.&"'("#!*.+)&$$(#(
)%%$(&&(-$'"-+,.%&"&&#*(+#,&"%-")(%#!!#$$-%#*''",-+)"($,"!(#*'(%',,',-*&$&-.,&&#%$*&"-(#%#$%!#)'.+%#""(%'(%*%!+)$'%%+%&'+!+$#+"&)&(*'("$&*-****-)&(&*%$,%""+,#(''#(%*)('),$&(%)()$%*,((#.)$,#,)(%(-&$+--)')(+%#+,&()%##%#&'((($-#&&*$)!$#)+-%-*$*),&#&*(#(#*#&.$#*.*%%.+!+"&"#-"+#'-'(')#!*$&*,-*$)&(-$(")%#,$()%"-+.++"+'%(((&)+&&%%)$+)'$'"+#&#&'##)$.$-'"'%*$"#,+#"'#(*))("&*'$$)-&!'%,-##.)$'(%.),"$)*$'+.(%$*%'#%.+$,-*,-%%'))",**'&%%$$-,,)$%#,)-&,$'"##*(#&".#"+(&##)!&,!-##+#"'.).!#'+"(##(%$*$.,&)$*'$!.&+"+)"'.&-&$"%*(,&+-(+*%+.&)$&.-("*$'(#$'-&($'"&,$%,&&#(&-'-%'$$,-&),--%#)$&$)#&%&$!,*(()"+.(.(&$!""%%+&..##!((%&#%$,#"'$,,#,&,'+%("(%,()%")-.%%#(,!*%,%#*&&#%,$-*%'"'&"#&*"&!*,--$)$"-&#*")&%,$!#*'!"'#,$)()*&()!(,',(((!$-%"'))$#)))#+$+&,'#*'(%&.#%(!%+%-%+)-%*-%*"#.$,$"!%(-(%-".'"$)%$-",#&-"+)'$!"#(''-'&'+%,*#-%+)("**$'))(*",*'%%.+$&(!($-(&$-'+,(,-+*(&%(,'%&#.&.##,'*"+%'-$."&(-%*)$#%+&##+%&'(-)+&*-$&&'.*(&),*&,--'").(&%.%$,%"%"%,&!*&*)#),,&*")(#%(**$)&'-*-(+$'($'%$#&-&(%#"%#$(*%%.%!+$**#,$)&+')"!&%&"!(#%**-+%(*!(((%)+#,*,)$-"%%&**+)$")$%'(,(-&'("")(%&&%)%*%!'%*$.-%)#)'",&!),-*%)%-&"$&&%*&+"'"&+*&.*,&($,$&+%**(!#)$(&-$#*)**!'+&,((%##"*($")-.)!!$-+#$-,$)-$'+)%)(-#+&$&-%&+("$()#*,))!,)%'#''+"""(,*+$)".'!!!.)#*,#&$&%(++!$-'%."'*,*&%/$%!$%!*%$$*'(&!&$&!#,)"-###$,&$&!())-)!#-"),'.()&)#$)(',*("-%)%%$%!#.&%$&%"$$!-.)"')..')-,.('%-)(%#-),'&"-*)-.)-&'(%'&&*$%%*.(+-,%,,#*%()%$-%$+)**-(")%%..'%(($+-'(&$$-,($"#+,+(("#!#+*"+'*&"!#%'"*!%!$$(%&&#'-#.&'"#$'%*$)$$#("$(",.%!(-*-#$#((,")&)#)&**-%,+&"&,$*(%%'&'-.!'(,-$(-)%%&.(*#%#("&++-&!*+-)+-(%#',+&'&.-&'*-,$'$.%!-*&),%))($%#)#$&*)*.&)!%,$+"$"#&#'&"$""#%&$%-'".#,-,'.$%(.&"&,(-*%%,+&&&#(+&'$%-"%+,$.'#%$$*"#!#+&.(+&.%%")+%!")),,$$*&'.((&"(',.,&*()&,*!.%&!&!$(%&%.-%('*)"(&"&##.+-$,%(().%"$,))#)-'(&'(,'%%,,&$)*+,.&*.%!"',"'!*&,)#*(.'-$&"!.***,!').+)!),'#,',"#.#*%).,&"$&"%'%%--%"-)*+#$!)"'#&($*)(.!.$*"''&+&)&,)#'+(*+(#$*($&),.($-#-%)+"&(*$&*%-'$/(,#$-"&%"($/+##"'+--(')+-*&$(*(#,"*+"#"'"%',,)%*+$)*!)'%&$,--&&+&#(-)!&!&$,($'%!!$*"-'*)+(.".*$*$$)$$&$-.''$-#
(.&)&)(#))))*.*.".$"#&'&&+,$&#%-#".'"'.*'.*%.&%&*+',(-(%!&'&-$)!(),!'$*!$+)''$-$"&&!)!%)+#!)*"!%"-%-#&$)',#!#"((+.(&.$&+*&'&,'*-*&+$+!,)-!)&&$%$(#((&.*"*#'(.(+*!(%$.#,,#-((',""")!+.*$*-*)$)'%!&)(&*+($)$#'&$)+-.($*+..+.$++-,#'&')$',$!.+&'*$+*#.+-%+"$&-&%#'+(&+#,(&,-')&$$'%&'"*+$#&("",#)!"&+#(*,+)&&&-"-$'+)'--)%,"+"(&#)"$#)%&&%*$*)%.$"*$()#(&!)#'(,((-$*-'"&,$.,$$,')."$.$$*(,#"+&.%%'&)!,&,"%")--"($#,$*!%)$'"&,&*'-'#.%'#*$(%",*)#,,*-$.-,+-$-)%'(*--"+&*.!"&-,#!!%"%(&$&#'/&#%$".#!&%+#)+$")&(!+-&",+()%")"(,$!$+&("#"*%)#%"*)+("$+$,&'$!-#$#$+--&."'"%#$.($)*$*...+&,!**(-,**#)''"&(*-#*'**(-''$+)..!))((-)*&)$-)(++,%,&!(%%.+(#''",#*&".!+&)&%)%'&$*%-*##&'%-(#*).$*&)"*+'+#"',($"$,%%!,*'#-#%)#)$,%--"**$,,)"*-'%*,)#!*)$)",%$,,&#*))&*,)-,+)*%.,'.%'*-"%%$.-"+)%-%$%!"%-$"-%*$#'('&%*&.&!%%#&%&%-.)-&%$)+'.'*'&%%"-%+$*,'&--))$()$,!'&)%$!#$',,+-*#*-$(&.,))'$('&*'%#(-)("-&-+'$'!.+$-(#"$#)&*##+.&$-%-++,#")*%'"(%%+$+-*%-'!!)#&&&$#%!-#$%#$-)%'%-)'"-$&,*+$*-(#%#).$'#+#%&"%!.(%..&-(,+%(%%'%,#''#!%)#'%'((--"$-'),!'.)&).%.+&+$-+-(%&+#%*!$,$'%$%)!)$'%(#))(&$+(#*+('-$(%,&,%""")#"%-&$!&,-+#**%&,$%+&-&%*$.#%!#+"&",(&$,&&.'&$$%+-&$,+,,!+.*##,$--&,%!)"*)'#$#+.!#($.&"%'&&*,$%+'*.#"!$,!"&$##($%'*)+,!*(+&'."('#()$-$'.),',%&&!,#*"%"*,+'*+(%)*$)%#$"#,%)!)"*)%,%%'*"&(-"*'**'-)##,%("((##.,#%",)))"&(.+$+(%*'.,-,',-"&&(*$*.%..%(&)#%'*'%/$$.&'#-%&."#-+%)#!.#%('%"$"+"'#$)')'-)+)-,+,"#,$',)#,#&-*,"*)$&)"&#.&-)%#''#.**')&-+-**((#*%*".++"##%++&)%.-")*!"),.&--%!))#!--+!!%*###++&!,"*$'&&+#%*')*'+*&.,+!$'').)&&)),&+*,('%#$,#-%-."%,,)&*#")*,&&')#)%.$,)&"'(&#+&$"$'*!(#"$--.)+.,&%$.',*,*$$$&(+,$$,(+".&%%*/")$,%&,*((()#,+,+(*%#(,%.$!%'*)(*,&.(("'#)"(,-'.,#)!",)!"*)(+%"'#,+!&,','(+*,$*)*&),$%$%&,-##%))#+)-*(#(#-%.'.!%&)'.&)%+,'-!,-!&+")+)"&#!)",)##&#(%,(.)',-,!$,&%-&#(''),##*#($,"#'-*,-()#&)&)&"(+#')!'+'-,!'",-(&+*(*$$,.-(())''$%'$#)*"'*,.-+(%($$),',,*'+&#)$%-*,"*('$)%#)'"%!%(.'!,*+%'(*#+&&,%,($%*,)$'')+'%%',*)(,-"&"(#&!.",($')''$..,"'$+%",%!)$"")!$&&(,)&"#&-))$),%&"&"'$%'--.%(%,.*&#&#$'('$(*#&&(++)%&!$)-#.''++((..*(,)&&##((
'#&),"&$$!#*..(##$*''+&%&,*(.%',+-,!.$$($%%()$*,+&%%$"..,*($&"&'&$')+*$!&-("')$%"-().(.+)(!%.&,*-&&*'.#$$"%*+-.$,-,+'#-'*.+").#.&'*#+,&%-""!.&$+)'-#),-%$*+$),$&*--,#%#&)(&)%)+.""&*.%&'&(*"*-+'$)%)%+&.%,..,#-%$&&("#!')",&!)$(),(%(.)%#)%.#((*-.).".+#")-&%"%&%-*-,&(.&#%-*-)!,(&%(#+),)''"+-&).),%#$*%'.&&)&($

Thanks,

Hajime Suzuki

isovic commented 7 years ago

Dear Hajime, Thank you for your detailed bug report and patch suggestions! They were very valuable! The problem you reported, as well as the one reported by Rob, were both caused by a recently-introduced bug in Edlib. I compiled a bug report based on your examples and forwarded it to the Edlib developer. He fixed the issue, and I integrated the new version in graphmap's codebase. It seems to resolve the issues you and Rob have reported, on the sample data you attached. Thank you again for your detailed help and effort you put in to provide it!

I will leave this thread open, because not all problems were resolved - David's issue is a bit different and unrelated to these ones.

Best regards, Ivan.

ocxtal commented 7 years ago

Hi Ivan,

Thank you for your support and detailed fix report. I've confirmed that the updated graphmap properly processed the sequence that I had reported on my environments.

Thank you very much, and best regards.

Hajime Suzuki

gringer commented 7 years ago

I'm still getting segfaults when running graphmap, mapping ~800Mb of reads to a mitochondrial genome. Some of the reads are longer than the mitochondrial genome itself:

Version: v0.3.2 Build date: Jan 18 2017 at 14:31:03

~/install/graphmap/bin/Linux-x64/graphmap align -C -r circ_Nb_ec3_mtDNA.fasta     -d pass_all_Jodie_NbAL5_2016_all.fasta -t 10 --min-read-len 200 -o /dev/stdout |   samtools view -b | samtools sort -O BAM -o GraphMap_pass_all_Jodie_NbAL5_2016_vs_mtDNA.bam
[14:56:13 Index] Running in normal (parsimonious) mode. Only one index will be used.
[14:56:13 Index] Index already exists. Loading from file.
[14:56:14 Index] Index loaded in 0.21 sec.
[14:56:14 Index] Memory consumption: [currentRSS = 259 MB, peakRSS = 259 MB]

[14:56:14 Run] Automatically setting the maximum allowed number of regions: max. 500, attempt to reduce after 0
[14:56:14 Run] No limit to the maximum number of seed hits will be set in region selection.
[14:56:14 Run] Reference genome is assumed to be circular.
[14:56:14 Run] Only one alignment will be reported per mapped read.
[14:56:14 ProcessReads] Reads will be loaded in batches of up to 1024 MB in size.
[14:56:22 ProcessReads] Batch of 116913 reads (786 MiB) loaded in 8.56 sec. (94423288267152 bases)
[14:56:22 ProcessReads] Memory consumption: [currentRSS = 1092 MB, peakRSS = 1092 MB]
[14:56:22 ProcessReads] Using 10 threads.
[14:56:23 ProcessReads] [CPU time: 18.74 sec, RSS: 1101 MB] Read: 228/116913 (0.20%) [m: 4, u: 215], length = 2457, qname: 1Dtemp_MN16602_0ad00e6fbfbb7a72_39_3_...*** Error in `/home/gringer/install/graphmap/bin/Linux-x64/graphmap': free(): invalid pointer: 0x00007f6cc8301240 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x7f6ce6c90bcb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76fa6)[0x7f6ce6c96fa6]
/lib/x86_64-linux-gnu/libc.so.6(+0x7779e)[0x7f6ce6c9779e]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0x2006b)[0x55e0a183b06b]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0x64651)[0x55e0a187f651]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0x7d64b)[0x55e0a189864b]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0x6f306)[0x55e0a188a306]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0x75e35)[0x55e0a1890e35]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0x65494)[0x55e0a1880494]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0xed513)[0x55e0a1908513]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0xefc24)[0x55e0a190ac24]
/home/gringer/install/graphmap/bin/Linux-x64/graphmap(+0xfef4f)[0x55e0a1919f4f]
/usr/lib/x86_64-linux-gnu/libgomp.so.1(+0x166b6)[0x7f6ce74f26b6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7464)[0x7f6ce7710464]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f)[0x7f6ce6d089df]
======= Memory map: ========
55e0a181b000-55e0a1a4a000 r-xp 00000000 08:22 4367942882                 /home/gringer/install/graphmap/bin/Linux-x64/graphmap
55e0a1c49000-55e0a1c52000 r--p 0022e000 08:22 4367942882                 /home/gringer/install/graphmap/bin/Linux-x64/graphmap
55e0a1c52000-55e0a1c55000 rw-p 00237000 08:22 4367942882                 /home/gringer/install/graphmap/bin/Linux-x64/graphmap
55e0a1c55000-55e0a1c58000 rw-p 00000000 00:00 0 
55e0a28cd000-55e0d69e6000 rw-p 00000000 00:00 0                          [heap]
7f6c8c000000-7f6c8c0cb000 rw-p 00000000 00:00 0 
7f6c8c0cb000-7f6c90000000 ---p 00000000 00:00 0 
7f6c94000000-7f6c94390000 rw-p 00000000 00:00 0 
7f6c94390000-7f6c98000000 ---p 00000000 00:00 0 
7f6c9c000000-7f6c9c4e0000 rw-p 00000000 00:00 0 
7f6c9c4e0000-7f6ca0000000 ---p 00000000 00:00 0 
7f6ca4000000-7f6ca458e000 rw-p 00000000 00:00 0 
7f6ca458e000-7f6ca8000000 ---p 00000000 00:00 0 
7f6cac000000-7f6cac4d3000 rw-p 00000000 00:00 0 
7f6cac4d3000-7f6cb0000000 ---p 00000000 00:00 0 
7f6cb4000000-7f6cb44cb000 rw-p 00000000 00:00 0 
7f6cb44cb000-7f6cb8000000 ---p 00000000 00:00 0 
7f6cbc000000-7f6cbc483000 rw-p 00000000 00:00 0 
7f6cbc483000-7f6cc0000000 ---p 00000000 00:00 0 
7f6cc4000000-7f6cc45df000 rw-p 00000000 00:00 0 
7f6cc45df000-7f6cc8000000 ---p 00000000 00:00 0 
7f6cc8000000-7f6cc8634000 rw-p 00000000 00:00 0 
7f6cc8634000-7f6ccc000000 ---p 00000000 00:00 0 
7f6ccc000000-7f6ccc487000 rw-p 00000000 00:00 0 
7f6ccc487000-7f6cd0000000 ---p 00000000 00:00 0 
7f6cd1ef9000-7f6cd1f0f000 r-xp 00000000 08:02 134218626                  /lib/x86_64-linux-gnu/libgcc_s.so.1
7f6cd1f0f000-7f6cd210e000 ---p 00016000 08:02 134218626                  /lib/x86_64-linux-gnu/libgcc_s.so.1
7f6cd210e000-7f6cd210f000 r--p 00015000 08:02 134218626                  /lib/x86_64-linux-gnu/libgcc_s.so.1
7f6cd210f000-7f6cd2110000 rw-p 00016000 08:02 134218626                  /lib/x86_64-linux-gnu/libgcc_s.so.1
7f6cd2110000-7f6cd2111000 ---p 00000000 00:00 0 
7f6cd2111000-7f6cd2911000 rw-p 00000000 00:00 0 
7f6cd2911000-7f6cd2912000 ---p 00000000 00:00 0 
7f6cd2912000-7f6cd3112000 rw-p 00000000 00:00 0 
7f6cd3112000-7f6cd3113000 ---p 00000000 00:00 0 
7f6cd3113000-7f6cd3913000 rw-p 00000000 00:00 0 
7f6cd3913000-7f6cd3914000 ---p 00000000 00:00 0 
7f6cd3914000-7f6cd4114000 rw-p 00000000 00:00 0 
7f6cd4114000-7f6cd4115000 ---p 00000000 00:00 0 
7f6cd4115000-7f6cd4915000 rw-p 00000000 00:00 0 
7f6cd4915000-7f6cd4916000 ---p 00000000 00:00 0 
7f6cd4916000-7f6cd5116000 rw-p 00000000 00:00 0 
7f6cd5116000-7f6cd5117000 ---p 00000000 00:00 0 
7f6cd5117000-7f6cd5917000 rw-p 00000000 00:00 0 
7f6cd5917000-7f6cd5918000 ---p 00000000 00:00 0 
7f6cd5918000-7f6cd6118000 rw-p 00000000 00:00 0 
7f6cd6118000-7f6cd6119000 ---p 00000000 00:00 0 
7f6cd6119000-7f6ce6a1c000 rw-p 00000000 00:00 0 
7f6ce6a1c000-7f6ce6a1e000 r-xp 00000000 08:02 134281670                  /lib/x86_64-linux-gnu/libdl-2.24.so
7f6ce6a1e000-7f6ce6c1e000 ---p 00002000 08:02 134281670                  /lib/x86_64-linux-gnu/libdl-2.24.so
7f6ce6c1e000-7f6ce6c1f000 r--p 00002000 08:02 134281670                  /lib/x86_64-linux-gnu/libdl-2.24.so
7f6ce6c1f000-7f6ce6c20000 rw-p 00003000 08:02 134281670                  /lib/x86_64-linux-gnu/libdl-2.24.so
7f6ce6c20000-7f6ce6db5000 r-xp 00000000 08:02 134281559                  /lib/x86_64-linux-gnu/libc-2.24.so
7f6ce6db5000-7f6ce6fb4000 ---p 00195000 08:02 134281559                  /lib/x86_64-linux-gnu/libc-2.24.so
7f6ce6fb4000-7f6ce6fb8000 r--p 00194000 08:02 134281559                  /lib/x86_64-linux-gnu/libc-2.24.so
7f6ce6fb8000-7f6ce6fba000 rw-p 00198000 08:02 134281559                  /lib/x86_64-linux-gnu/libc-2.24.so
7f6ce6fba000-7f6ce6fbe000 rw-p 00000000 00:00 0 
7f6ce6fbe000-7f6ce70c1000 r-xp 00000000 08:02 134282192                  /lib/x86_64-linux-gnu/libm-2.24.so
7f6ce70c1000-7f6ce72c0000 ---p 00103000 08:02 134282192                  /lib/x86_64-linux-gnu/libm-2.24.so
7f6ce72c0000-7f6ce72c1000 r--p 00102000 08:02 134282192                  /lib/x86_64-linux-gnu/libm-2.24.so
7f6ce72c1000-7f6ce72c2000 rw-p 00103000 08:02 134282192                  /lib/x86_64-linux-gnu/libm-2.24.so
7f6ce72c2000-7f6ce72db000 r-xp 00000000 08:02 134218200                  /lib/x86_64-linux-gnu/libz.so.1.2.8
7f6ce72db000-7f6ce74da000 ---p 00019000 08:02 134218200                  /lib/x86_64-linux-gnu/libz.so.1.2.8
7f6ce74da000-7f6ce74db000 r--p 00018000 08:02 134218200                  /lib/x86_64-linux-gnu/libz.so.1.2.8
7f6ce74db000-7f6ce74dc000 rw-p 00019000 08:02 134218200                  /lib/x86_64-linux-gnu/libz.so.1.2.8
7f6ce74dc000-7f6ce7508000 r-xp 00000000 08:02 47223                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f6ce7508000-7f6ce7707000 ---p 0002c000 08:02 47223                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f6ce7707000-7f6ce7708000 r--p 0002b000 08:02 47223                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f6ce7708000-7f6ce7709000 rw-p 0002c000 08:02 47223                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f6ce7709000-7f6ce7721000 r-xp 00000000 08:02 134282564                  /lib/x86_64-linux-gnu/libpthread-2.24.so
7f6ce7721000-7f6ce7920000 ---p 00018000 08:02 134282564                  /lib/x86_64-linux-gnu/libpthread-2.24.so
7f6ce7920000-7f6ce7921000 r--p 00017000 08:02 134282564                  /lib/x86_64-linux-gnu/libpthread-2.24.so
7f6ce7921000-7f6ce7922000 rw-p 00018000 08:02 134282564                  /lib/x86_64-linux-gnu/libpthread-2.24.so
7f6ce7922000-7f6ce7926000 rw-p 00000000 00:00 0 
7f6ce7926000-7f6ce7949000 r-xp 00000000 08:02 134281222                  /lib/x86_64-linux-gnu/ld-2.24.so
7f6ce7af8000-7f6ce7afd000 rw-p 00000000 00:00 0 
7f6ce7b0f000-7f6ce7b48000 rw-p 00000000 00:00 0 
7f6ce7b48000-7f6ce7b49000 r--p 00022000 08:02 134281222                  /lib/x86_64-linux-gnu/ld-2.24.so
7f6ce7b49000-7f6ce7b4a000 rw-p 00023000 08:02 134281222                  /lib/x86_64-linux-gnu/ld-2.24.so
7f6ce7b4a000-7f6ce7b4b000 rw-p 00000000 00:00 0 
7fff0d602000-7fff0d623000 rw-p 00000000 00:00 0                          [stack]
7fff0d798000-7fff0d79a000 r--p 00000000 00:00 0                          [vvar]
7fff0d79a000-7fff0d79c000 r-xp 00000000 00:00 0                          [vdso]
[14:56:23 ProcessReads] [CPU time: 19.00 sec, RSS: 1098 MB] Read: 234/116913 (0.20%) [m: 4, u: 221], length = 2377, qname: 1Dtemp_MN16602_0ad00e6fbfbb7a72_375_4...
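
Since the report above mentions reads that are longer than the reference, one way to narrow down a failing input is to list those reads first. The following is a minimal sketch, not part of graphmap: the function names are hypothetical, and it assumes plain (uncompressed) FASTA input, as used for the `-r` and `-d` files in the command above.

```python
from typing import Iterable, Iterator, List, Tuple


def fasta_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (name, length) for each record in FASTA-formatted lines."""
    name = None
    length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Keep only the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def reads_longer_than(read_lines: Iterable[str], ref_length: int) -> List[Tuple[str, int]]:
    """Return (name, length) for every read strictly longer than ref_length."""
    return [(n, l) for n, l in fasta_lengths(read_lines) if l > ref_length]


def long_reads_in_files(ref_path: str, reads_path: str) -> List[Tuple[str, int]]:
    """Compare reads against the longest sequence in the reference file."""
    with open(ref_path) as ref:
        ref_length = max((l for _, l in fasta_lengths(ref)), default=0)
    with open(reads_path) as reads:
        return reads_longer_than(reads, ref_length)
```

Calling `long_reads_in_files` with the reference and read paths returns the candidate reads, which can then be mapped one at a time with `-t 1` to see whether they are the ones that crash. Whether read length is actually the cause of the crash is not established here.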

isovic commented 7 years ago

Hi David, I fixed the segfaults you reported earlier (and I hope this one is related to them), but the fixes are still not merged to master due to some new features requiring additional testing. Hopefully I'll manage to finish it this week and let you know, so you can try re-running your process. Thank you, Best regards, Ivan.

isovic commented 7 years ago

Hi David, I finally merged the fixes to the master branch (latest commit). Any chance you can re-run your tests and tell me if it still segfaults?

Thank you, Best regards, Ivan.

wdecoster commented 7 years ago

I was trying the RNA-seq alignment you announced and get a segmentation fault.

[00:55:06 ProcessReads] Using 1 threads.
[00:55:07 ProcessReads] [CPU time: 0.93 sec, RSS: 246 MB] Read: 640/75299 (0.85%) [m: 0, u: 640], length = 784, qname: channel_69_a4f60262-4c5c-4ece-bcd0-c69292...Segmentation fault (core dumped)

Running with -t 1 identified the "offending" read (repeated 3 times to check):

@channel_69_a4f60262-4c5c-4ece-bcd0-c6929237b4e8_template:fast5basecalled//PC_nanopore_001_20170117_FN_MN16146_mux_scan_sample_id_79759_ch69_read34_strand2.fast5
TGCAGCGTTCGTTACGTATTATGGTTTCTTGGGTTTGTGTAACCTTTGTGCTCTGTTGGTACTTTGTGTTGTGTAACCGGGCCTTGAGTAGACTCCATCTAAAACAAAAACAAAAACAAAACAAGCATTGTTTGTTTGATAGTATATGTATTTCCCATTGTCATTGTAACATTATAAGTGTGAACGAACAATAAACGAGCTGTTCTGTTGTGCTACTTGTTGCCAAATTTGTTCATAGTTCAAGAATCTTCCTGCCTTCTTTGAAAGTACAAATGTTATTACTATTCACTGTTTTAAAAGCAGAAAACACCATCTGGTATTAAAAGAAACTAGGAGGCTATTGGTTTCAATTTCTAATGGCTTTATAGTGTGTCTGGGCATCCTTCTGTGCCGTGAGCGGGGTCAAATTCTGAAGACTTAAGCCTGGTACGCCGGTATAACATTCTCATAGGGCGAATCGCTGGGCAGATACTTGACAGAATTGAGGTGACCCCATAAACGGTTGGTTGCCTATAGTATTAAGTGACAACTCAATTCTGTCATTTCGAATACTCACTATTTTGAAATAATGATGCTATTCTTCTTCAGCGGAAAACTATTCAACCCTGGTAACGGCAAGGCTATCTAACAATATCTTAATTGTGTCGTGTTCCCACTAGTGTGTTAACTATTGTCTTATGTCATGAACCTATAAACAAACAATGTTGTTGTTGTTTATTTTGGTGTTCTCCTCGCCTTCTATCGCTCTTTAAAATAAAATTCTTCATCTGCCTATATAACTTTG
+


In addition, I receive this:

[00:52:32 Index] Running in normal (parsimonious) mode. Only one index will be used.
[00:52:32 Index] Index already exists. Loading from file.
[00:53:33 LoadOrGenerateTranscriptome] Index needs to be rebuilt. It was generated using an older version.
[00:53:33 LoadOrGenerateTranscriptome] Existing index is a genome, and you are trying to map to a transcriptome. Index needs to be rebuilt.

Which is, I believe, not the case.

isovic commented 7 years ago

Hi, thanks for the report! Strange my profiling didn't catch that - I would like to fix it asap. Could you by any chance share the reference and the GTF so I can reproduce? Best regards, Ivan.

wdecoster commented 7 years ago

I reproduced the segmentation fault using the fastq with just the offending read.
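
Isolating a single read like this can be scripted. The sketch below is an illustration, not the procedure actually used in this thread. It assumes a 4-line-per-record FASTQ (no wrapped sequences), and it matches on a name prefix because the progress line truncates the qname with `...`. The function names are hypothetical.

```python
from typing import Iterable, Iterator, List


def fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield 4-line FASTQ records: header, sequence, '+' line, quality."""
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield record
            record = []


def extract_read(lines: Iterable[str], name_prefix: str) -> List[List[str]]:
    """Return every record whose read name starts with name_prefix."""
    hits = []
    for record in fastq_records(lines):
        header = record[0]
        # FASTQ headers start with '@'; compare the text after it.
        if header.startswith("@") and header[1:].startswith(name_prefix):
            hits.append(record)
    return hits


def write_records(records: List[List[str]], out_path: str) -> None:
    """Write the selected records to a new FASTQ file."""
    with open(out_path, "w") as out:
        for record in records:
            out.write("\n".join(record) + "\n")
```

Passing an open FASTQ file and the visible part of the qname to `extract_read`, then writing the result with `write_records`, gives a one-read file that can be attached to a bug report.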

But here is the punchline already: I used the hg19 igenome as the genome reference and GRCh37 for the gtf, which I now realize use incompatible chromosome names. I probably should get a bit of sleep. This also explains the message I received about rebuilding the index.

I reran using sequence and gtf from the same build (both GRCh37), and this already showed different output when building the index (didn't see this earlier):

Sequence file before transcriptome generation:
Num sequences: 25
Currently open file (only when batch loading): ''
Current batch ID: 0
Current batch starting sequence ID: 0

Transcribed sequences:
Num sequences: 196354
Currently open file (only when batch loading): ''
Current batch ID: 0
Current batch starting sequence ID: 0

Reloading the index the next time was as it should be, without rebuilding. Mapping the reads is progressing without problems, for now. Seems the index problem also explains the segmentation fault.

So this looks like a typical example of user stupidity combined with the craziness of incompatible chromosome identifiers? Apologies for the incorrect "bug" report!
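
A mismatch of this kind can be caught before building an index by comparing the sequence names in the FASTA with the first column of the GTF. The following is a minimal sketch with hypothetical function names, assuming uncompressed files. The example names in the comments (UCSC-style `chr1` versus Ensembl-style `1`) illustrate one common form of the mismatch and are an assumption about the files in this report.

```python
from typing import Dict, Iterable, Set


def fasta_names(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names (first token after '>') from FASTA lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def gtf_seqnames(lines: Iterable[str]) -> Set[str]:
    """Collect the seqname column (column 1) from GTF lines."""
    names = set()
    for line in lines:
        # Skip comment/pragma lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_lines: Iterable[str], gtf_lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Report names shared by both files and names unique to each."""
    ref = fasta_names(fasta_lines)
    ann = gtf_seqnames(gtf_lines)
    return {
        "shared": ref & ann,       # usable for transcriptome construction
        "fasta_only": ref - ann,   # e.g. 'chr1' when the GTF says '1'
        "gtf_only": ann - ref,     # annotations with no matching sequence
    }
```

An empty `shared` set means no annotation line refers to any sequence in the reference, which is the situation described above.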

isovic commented 7 years ago

Nah, you actually found a good test case which I thought I handled, but seems to be buggy nonetheless. :-) Thank you! Best regards, Ivan.

gringer commented 7 years ago

Sorry, still getting a crash when mapping reads to the mitochondrial reference. It might be a different crash, but from my perspective it produces the same end result:

$ ~/install/graphmap/bin/Linux-x64/graphmap
Usage:
  /home/gringer/install/graphmap/bin/Linux-x64/graphmap tool

Options
    tool       STR   Specifies the tool to run:
                       align - the entire GraphMap pipeline.
                       owler - Overlapping With Long Erroneous Reads.

GraphMap (c) by Ivan Sovic, Mile Sikic and Niranjan Nagarajan
GraphMap is licensed under The MIT License.

Version: v0.4.1
Build date: Feb  1 2017 at 16:27:55
$ ~/install/graphmap/bin/Linux-x64/graphmap align -C -r circ-Nb-ec3-mtDNA.fasta -d bad_read_mtDNA.fastq -t 1 --min-read-len 100
[09:11:52 Index] Running in normal (parsimonious) mode. Only one index will be used.
[09:11:52 Index] Index is not prebuilt. Generating index.
[09:11:52 LoadOrGenerate] Started generating new index from file 'circ-Nb-ec3-mtDNA.fasta'...
[09:11:52 LoadOrGenerate] Storing new index to file 'circ-Nb-ec3-mtDNA.fasta.gmidx'...
[09:11:52 LoadOrGenerate] New index stored.
[09:11:52 Index] Index loaded in 0.46 sec.
[09:11:52 Index] Memory consumption: [currentRSS = 172 MB, peakRSS = 300 MB]

[09:11:52 Run] Automatically setting the maximum allowed number of regions: max. 500, attempt to reduce after 0
[09:11:52 Run] No limit to the maximum number of seed hits will be set in region selection.
[09:11:52 Run] Reference genome is assumed to be circular.
[09:11:52 Run] Only one alignment will be reported per mapped read.
@HD VN:1.0  SO:unknown
@SQ SN:Nb_mtDNA LN:13355
@PG ID:graphmap PN:graphmap CL:/home/gringer/install/graphmap/bin/Linux-x64/graphmap -C -r circ-Nb-ec3-mtDNA.fasta -d bad_read_mtDNA.fastq -t 1 --min-read-len 100  VN:v0.4.1 compiled on Feb  1 2017 at 16:27:28
[09:11:52 ProcessReads] Reads will be loaded in batches of up to 1024 MB in size.
[09:11:52 ProcessReads] Batch of 1 reads (0 MiB) loaded in 0.00 sec. (93931205259616 bases)
[09:11:52 ProcessReads] Memory consumption: [currentRSS = 172 MB, peakRSS = 300 MB]
[09:11:52 ProcessReads] Using 1 threads.
[09:11:52 ProcessReads] [CPU time: 0.00 sec, RSS: 172 MB] Read: 0/1 (0.00%) [m: 0, u: 0], length = 6396, qname: c3589779-27c9-4dfc-9d35-7aa709113174_Basecall_Al...terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

A read/reference pair that doesn't work is attached.

bad_read_ref.zip

gringer commented 7 years ago

I ran a static analysis tool, cppcheck, through the code, which came up with the following warnings / errors:

[alignment/alignment_wrappers.cc:108] -> [alignment/alignment_wrappers.cc:87]: (warning) Either the condition 'ret_start_ambiguity!=0' is red
[alignment/alignment_wrappers.cc:109] -> [alignment/alignment_wrappers.cc:51]: (warning) Either the condition 'ret_end_ambiguity!=0' is redun
[alignment/alignment_wrappers.cc:509] -> [alignment/alignment_wrappers.cc:514]: (warning) Either the condition 'if(positions)' is redundant o
[alignment/alignment_wrappers.cc:510] -> [alignment/alignment_wrappers.cc:514]: (warning) Either the condition 'if(positions)' is redundant o
[alignment/alignment_wrappers.cc:556] -> [alignment/alignment_wrappers.cc:561]: (warning) Either the condition 'if(positions)' is redundant o
[alignment/alignment_wrappers.cc:557] -> [alignment/alignment_wrappers.cc:561]: (warning) Either the condition 'if(positions)' is redundant o
[alignment/alignment_wrappers.cc:599]: (error) Memory leak: converted_data
[alignment/alignment_wrappers.cc:718]: (error) Memory leak: converted_data
[alignment/cigargen.cc:25] -> [alignment/cigargen.cc:26]: (warning) Either the condition 'if(cigar)' is redundant or there is possible null p
[alignment/cigargen.cc:278]: (error) Memory leak: alignment_with_clipping
[index/index.cc:464]: (error) Memory leak: new_header
[index/index.cc:605]: (error) Memory leak: new_header
[index/index_hash.cc:11]: (warning) Member variable 'IndexHash::num_sequences_' is not initialized in the constructor.
[index/index_hash.cc:11]: (warning) Member variable 'IndexHash::data_length_' is not initialized in the constructor.
[index/index_hash.cc:11]: (warning) Member variable 'IndexHash::num_sequences_forward_' is not initialized in the constructor.
[index/index_hash.cc:11]: (warning) Member variable 'IndexHash::data_length_forward_' is not initialized in the constructor.
[index/index_hash.cc:11]: (warning) Member variable 'IndexHash::data_ptr_' is not initialized in the constructor.
[index/index_hash.cc:185]: (error) Memory leak: kmer_countdown
[index/index_owler.cc:480]: (error) Common realloc mistake: 'all_hits' nulled but not freed upon failure
[owler/dpfilter.cc:25] -> [owler/dpfilter.cc:20]: (warning) Either the condition 'prev!=0' is redundant or there is possible null pointer der
[owler/dpfilter.cc:26] -> [owler/dpfilter.cc:20]: (warning) Either the condition 'prev!=0' is redundant or there is possible null pointer der
[owler/dpfilter.cc:55] -> [owler/dpfilter.cc:50]: (warning) Either the condition 'prev!=0' is redundant or there is possible null pointer der
[owler/dpfilter.cc:56] -> [owler/dpfilter.cc:50]: (warning) Either the condition 'prev!=0' is redundant or there is possible null pointer der
[owler/dpfilter.cc:80] -> [owler/dpfilter.cc:76]: (warning) Either the condition 'prev!=0' is redundant or there is possible null pointer der
[owler/dpfilter.cc:209] -> [owler/dpfilter.cc:204]: (warning) Either the condition 'prev!=0' is redundant or there is possible null pointer d
[owler/dpfilter.cc:210] -> [owler/dpfilter.cc:204]: (warning) Either the condition 'prev!=0' is redundant or there is possible null pointer d
[owler/process_read.cc:1595] -> [owler/process_read.cc:1603]: (warning) Either the condition 'new_cluster!=0' is redundant or there is possib
[owler/process_read.cc:1596] -> [owler/process_read.cc:1603]: (warning) Either the condition 'new_cluster!=0' is redundant or there is possib
[owler/process_read.cc:1735] -> [owler/process_read.cc:1742]: (warning) Either the condition 'new_cluster!=0' is redundant or there is possib
[owler/process_read.cc:1736] -> [owler/process_read.cc:1742]: (warning) Either the condition 'new_cluster!=0' is redundant or there is possib
[program_parameters.cc:197]: (warning) fprintf format string requires 0 parameters but 1 is given.

I expect that the memory leak errors should at least be looked at in more detail.

gringer commented 7 years ago

The first one that I checked (converted_data) could potentially be fixed by replacing uint8_t * with std::vector, but the rabbit hole of templated functions was a bit too deep for me to make a quick fix.
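
A minimal sketch of the kind of change meant here. The helper name and the base encoding are hypothetical and not taken from GraphMap's code; only the variable name converted_data comes from the cppcheck report. The idea is that a std::vector owns its buffer, so the memory is released on every return path:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not GraphMap's actual function: encodes bases as small
// integers. The vector owns its memory, so an early return cannot leak it.
std::vector<uint8_t> ConvertSequence(const std::string &seq) {
  std::vector<uint8_t> converted_data(seq.size());
  for (std::size_t i = 0; i < seq.size(); ++i) {
    switch (seq[i]) {
      case 'A': converted_data[i] = 0; break;
      case 'C': converted_data[i] = 1; break;
      case 'G': converted_data[i] = 2; break;
      case 'T': converted_data[i] = 3; break;
      default:  converted_data[i] = 4; break;  // anything else, e.g. N
    }
  }
  return converted_data;
}
```

Code that still expects a raw pointer can be given converted_data.data() together with converted_data.size(), which may be one way around the templated functions.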

gringer commented 7 years ago

Regarding the index_hash errors, it looks like you've tried using a std::vector to store indexes to a hash table of counts. It would probably be less painful (for debugging and coding) if a std::unordered_map were used to replace the hash table:

std::unordered_map<int64_t,int64_t> kmer_counts;
...
if(kmer_counts.count(hash_key) == 1){ // if hash_key exists with any count
  kmer_counts[hash_key] += 1;
} else {
  kmer_counts[hash_key] = 1;
}

It's possible to abstract this further and use an unordered_multiset (which doesn't need the existence check), but then getting the total number of kmers and iterating through the set of unique values is a bit harder.
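
As a side note, the existence check in the snippet above is not strictly needed with std::unordered_map either, because operator[] inserts a zero-initialised value for a missing key. A self-contained sketch (the function name CountKmers and the input vector are illustrative assumptions, not part of GraphMap):

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Counts how often each hash key occurs. operator[] inserts a zero-initialised
// count for a key that is not present yet, so no existence check is needed.
std::unordered_map<int64_t, int64_t> CountKmers(const std::vector<int64_t> &hash_keys) {
  std::unordered_map<int64_t, int64_t> kmer_counts;
  for (int64_t hash_key : hash_keys) {
    ++kmer_counts[hash_key];
  }
  return kmer_counts;
}
```

With this layout, kmer_counts.size() is the number of distinct keys, and iterating over the map visits each unique key exactly once.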

isovic commented 7 years ago

Hi David! Thanks for the reports! Cppcheck seems like a great tool - I usually only use Valgrind and GDB. The converted_data buffer is actually used only in the two functions with experimental code, which under normal circumstances never gets called or even compiled. A leak like that is not a good thing in any case, so I fixed it locally by using std::vector, and will push it as soon as I resolve another issue - currently I'm trying to debug why the indexing breaks on the hg19 reference (a separate issue), but it's going slowly due to its size. Using the STL is something I need to do in a lot of places in the code; I don't know why I actually used manual allocations.

As for the issue you reported earlier, regarding the circular genome alignment - I should be able to resolve it soon. I'll let you know of the progress!

Thank you for your suggestions and input! These are greatly appreciated!

Best regards, Ivan.

isovic commented 7 years ago

Hi David, here's an update. I've re-implemented the entire index, as well as fixed a couple of issues related to aligning to circular genomes. The fix will be included in the next release which is coming soon (within a week, hopefully), together with some new goodies such as speed improvements on larger references. Best regards, Ivan.

filicado commented 7 years ago

How does graphmap handle ambiguous DNA letters in the reference?

e.g. K for G/T, S for G/C or V for G/C/A

I noticed a segmentation fault when using a reference with such letters (but not always), but no problem with the same reference sequence without them. The index seems to build fine, but then:

Segmentation fault (core dumped) graphmap align -r ../longer.fasta -d ../full.fastq -o ../out.sam
[E::sam_parse1] SEQ and QUAL are of different length
[W::sam_read1] parse error at line 11
[main_samview] truncated file.

Nik

isovic commented 7 years ago

Hi everyone,

there have been many updates and changes, and in the latest version I (hopefully) addressed all of the above issues. Would you mind giving it a spin to verify if everything is well now?

@filicado Ambiguous DNA bases are treated as Ns and simply skipped during seed lookup. During alignment, however, they are considered part of the alphabet as individual characters and are only matched to the same character (K would be matched to K and not to G/T). Your segfault error is interesting - could you try the same with the latest commit?
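
To illustrate the two behaviours described above, here is a rough sketch. It is not GraphMap's implementation; the function names are made up for this example, and it assumes an uppercase reference. Seeds that touch any letter other than A, C, G or T are skipped, while the alignment-style comparison is purely literal:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative only: returns the start positions of k-mers that contain
// nothing but A, C, G or T. K-mers touching any other letter (N, K, S, V, ...)
// are skipped, which is how "treated as Ns during seed lookup" is read here.
std::vector<std::size_t> SeedPositions(const std::string &ref, std::size_t k) {
  std::vector<std::size_t> positions;
  if (k == 0 || ref.size() < k) return positions;
  std::size_t valid_run = 0;  // length of the current run of unambiguous bases
  for (std::size_t i = 0; i < ref.size(); ++i) {
    const char c = ref[i];
    const bool unambiguous = (c == 'A' || c == 'C' || c == 'G' || c == 'T');
    valid_run = unambiguous ? valid_run + 1 : 0;
    if (valid_run >= k) positions.push_back(i + 1 - k);
  }
  return positions;
}

// Literal comparison, as in the alignment step described above:
// K matches only K, not G or T.
bool BasesMatch(char a, char b) { return a == b; }
```

For the reference ACGKACGT with k = 3, only the k-mers starting at positions 0, 4 and 5 would be used as seeds, since every other k-mer overlaps the K.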

Best regards, Ivan.

gringer commented 7 years ago

I've had no further mapping problems with the new version of GraphMap; everything seems to be working fine.

claumer commented 7 years ago

Hello Ivan,

I'm writing to report that I've been working with a dataset that causes segfaults in graphmap, but curiously only when running with the anchored alignment algorithm -- the semi-global algorithms (both gotoh and normal) seem to handle it well (caveat: the jobs haven't yet completed but are about 1/4 of the way through 4.4M reads). Attached is a sample of the offending file, which I've verified reproduces the segfault. I'm also attaching an example logfile. I'm running version 0.5.1, compiled with gcc 6.3, on a CentOS 6 cluster.

These reads came from a very low-input sample (<400 pg), amplified with Illumina adapters/PCR indexing primers, but the adapters were trimmed off of the reads at the 5' and 3' ends with cutadapt. One of the purposes of doing this experiment is to investigate evidence for/quantify chimeric reads originating through PCR recombination in this novel protocol. If this could indeed be occurring, is this a possible cause of the segfault? Or is it a more straightforward bug?

Let me know if any other information could be useful in investigating this...!

Best, Chris L

NA12878_cutadapt_first10000.fastq.gz graphmap_segfault_log.txt