mengyao / Complete-Striped-Smith-Waterman-Library

Inconsistent cigar and cigar_string in the C++ wrapper when -O3 is used #41

Closed bowhan closed 7 years ago

bowhan commented 7 years ago

The C++ wrapper reports different alignment.cigar and alignment.cigar_string values for the same input depending on the compiler optimization level. To reproduce this behavior:

// test.cpp
#include "ssw_cpp.h"
#include <cstdint>   // uint8_t
#include <iostream>
#include <string>

using namespace std;

int main() {
    string t = "GTCACAAGGAGAGGACTGGGGGACTGCCTGGCACAGAGCAGTTGCTGATCAACACAGCTGCAGCCAGGGCTGAGAAGATGACAAGCATTTCCTCTGTCAGGAGAGACATCCATGCCACTCCAGGGCCCTCCCCATCCCAGGAAGGCCCCTCCAAGCACCCAGTCTCTAACACAGCCCACTTCCTCACAGCTGAGCCCCCCACTGTGGTGACTGGCAGTCTCCTAGTGGGACCAGTGAGCGACTGCTCCACCCTGCCCTGCCTGCCACTGCCTGCGCTGTTCAACCAGGAGCCAGCCTCCGGCCAGATGCGCCTGGAGAAAACCGACCAGATTCCCAGTATGTTAGGGGGCTTGGAGAGAGTGGGCTTTCTCCCTCTTGGGAGGTGGATAAGGAGTTGACAC";
    string q = "ATTTCCTCTGTCAGGAGAGACATCCATGCCACTCCAAACATGAAAACCCATTTCAGATTTTTTCAAGACTGTGGTGACTGGCAGTCTCCTAGTGGGACCAGTGAGCGACTGCTCCACCCTGCCCTGCCTGCCACTGCCTGCGCTGTTCAACCAGGAGCCAGCCTCCGGCCAGATGCGCCTGGAGAAAACCGACCAGATTCCCAGTATGTTAGGG";
    uint8_t match_score = 2;
    uint8_t mismatch_penalty = 2;
    uint8_t gap_open_penalty = 3;
    uint8_t gap_ext_penalty  = 1;
    StripedSmithWaterman::Aligner aligner{match_score, mismatch_penalty, gap_open_penalty, gap_ext_penalty};
    aligner.SetReferenceSequence(t.c_str(), t.size());
    StripedSmithWaterman::Filter filter;
    StripedSmithWaterman::Alignment alignment;
    aligner.Align(q.c_str(), filter, &alignment);
    cerr << alignment.cigar_string << endl;
    return 0;
}
g++ -std=c++11 -O2 -o o2 -I . test.cpp ssw_cpp.cpp ssw.c && ./o2
36=11D1=2D1X2=1X3=9D1=3D5=1D1=1X1=6D3=3D1=1X2=2X3=1D2=11D147=

g++ -std=c++11 -O3 -o o3 -I . test.cpp ssw_cpp.cpp ssw.c && ./o3
36=10X1=1X2D1=5X2=5X1=1X4=6X2=2X2=2X2=4X2=3X1=5X3=7X1=2X1=1X1=5X2=1X1=2X2=3X1=2X2=4X1=2X1=3X1=1X1=1X1=1X1=5X1=2X2=5X1=4X3=2X3=11X1=1X1=13X1=2X2=2X1=1X1=9X1=1X3=4X1=2X1=7X1=7X1=3X
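
One way to compare the two CIGAR strings above is to count how many query and reference bases each one consumes. The following is only a sketch: `CigarSpan` and `SpanOfCigar` are hypothetical helpers, not part of ssw_cpp.h, and the operation semantics are assumed to follow the usual SAM conventions (`M`, `=`, `X` consume both sequences; `I`, `S` consume the query only; `D`, `N` consume the reference only).

```cpp
#include <cctype>
#include <cstdint>
#include <stdexcept>
#include <string>

// Number of query and reference bases consumed by a CIGAR string.
struct CigarSpan {
    uint64_t query_len = 0;
    uint64_t ref_len = 0;
};

// Hypothetical helper (not part of ssw_cpp.h): parses text such as "36=11D1X".
CigarSpan SpanOfCigar(const std::string& cigar) {
    CigarSpan span;
    uint64_t len = 0;
    bool have_len = false;
    for (char c : cigar) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            // Accumulate the decimal run length that precedes each operation.
            len = len * 10 + static_cast<uint64_t>(c - '0');
            have_len = true;
            continue;
        }
        if (!have_len) {
            throw std::invalid_argument("CIGAR operation without a length");
        }
        switch (c) {
            case 'M': case '=': case 'X':  // consume query and reference
                span.query_len += len;
                span.ref_len += len;
                break;
            case 'I': case 'S':            // consume query only
                span.query_len += len;
                break;
            case 'D': case 'N':            // consume reference only
                span.ref_len += len;
                break;
            case 'H': case 'P':            // consume neither
                break;
            default:
                throw std::invalid_argument("unknown CIGAR operation");
        }
        len = 0;
        have_len = false;
    }
    if (have_len) {
        throw std::invalid_argument("trailing length without an operation");
    }
    return span;
}
```

Calling `SpanOfCigar(alignment.cigar_string)` in both the -O2 and -O3 builds and printing `query_len` and `ref_len` next to the begin/end coordinates reported in `alignment` would show whether each string is at least consistent with the span of its own alignment.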

g++ --version
g++ (Ubuntu 5.4.1-2ubuntu1~16.04) 5.4.1 20160904
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

uname -a
Linux ip-172-31-10-44 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

This does NOT happen when the C API is used.
This does NOT happen when the same code is run on macOS using LLVM.
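
To check whether alignment.cigar and alignment.cigar_string agree with each other in a given build, the packed vector can be rendered as text and compared with the string. This is a sketch under an assumption: that alignment.cigar uses the BAM packing (high 28 bits are the run length, low 4 bits index into `MIDNSHP=X`). `CigarVectorToString` is a hypothetical helper, not part of the library.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of ssw_cpp.h): renders BAM-packed CIGAR
// entries as text so the result can be compared with cigar_string.
// Assumes each entry is (length << 4) | op_code.
std::string CigarVectorToString(const std::vector<uint32_t>& cigar) {
    static const char kOps[] = "MIDNSHP=X";  // BAM operation codes 0..8
    std::string out;
    for (uint32_t packed : cigar) {
        const uint32_t len = packed >> 4;    // high 28 bits: run length
        const uint32_t op = packed & 0xFu;   // low 4 bits: operation code
        out += std::to_string(len);
        out += (op < 9) ? kOps[op] : '?';    // '?' marks an unexpected code
    }
    return out;
}
```

In the reproducer, printing `CigarVectorToString(alignment.cigar)` right after `alignment.cigar_string` would show whether the two fields diverge from each other or only differ between the -O2 and -O3 builds.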

bowhan commented 7 years ago

The issue goes away when g++ 6.2.0 is used, so this is possibly a bug in GNU g++.