Closed ksahlin closed 5 years ago
This was implemented in a previous version but I omitted it after as we got similar results with less resources in racon. You can try commit a6dd47a
. If it seems better, I can reinstate it.
ok, will do and I'll get back to you on that.
I did git checkout a6dd47a
and followed the installation instructions (removed previous build folder) and got:
CSE-mkspm01:build kxs624$ cmake -DCMAKE_BUILD_TYPE=Release ..
-- The C compiler identification is AppleClang 7.3.0.7030031
-- The CXX compiler identification is AppleClang 7.3.0.7030031
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/kxs624/Documents/workspace/spoa/build
CSE-mkspm01:build kxs624$ make
Scanning dependencies of target spoa
[ 14%] Building CXX object CMakeFiles/spoa.dir/src/alignment_engine.cpp.o
/Users/kxs624/Documents/workspace/spoa/src/alignment_engine.cpp:26:9: error: use of undeclared identifier 'exit'
exit(1);
^
1 error generated.
make[2]: *** [CMakeFiles/spoa.dir/src/alignment_engine.cpp.o] Error 1
make[1]: *** [CMakeFiles/spoa.dir/all] Error 2
make: *** [all] Error 2
Add #include <stdlib.h>
on top of src/alignment_engine.cpp
.
ok did that. Build works fine, but run into this in the make step:
CSE-mkspm01:build kxs624$ make
Scanning dependencies of target spoa
[ 14%] Building CXX object CMakeFiles/spoa.dir/src/alignment_engine.cpp.o
[ 28%] Building CXX object CMakeFiles/spoa.dir/src/chain.cpp.o
In file included from /Users/kxs624/Documents/workspace/spoa/src/chain.cpp:9:
/Users/kxs624/Documents/workspace/spoa/src/chain.hpp:54:23: error: no type named 'FastaReader' in namespace 'bioparser'
friend bioparser::FastaReader<Chain>;
~~~~~~~~~~~^
/Users/kxs624/Documents/workspace/spoa/src/chain.hpp:54:34: error: expected member name or ';' after declaration specifiers
friend bioparser::FastaReader<Chain>;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
/Users/kxs624/Documents/workspace/spoa/src/chain.hpp:55:23: error: no type named 'FastqReader' in namespace 'bioparser'
friend bioparser::FastqReader<Chain>;
~~~~~~~~~~~^
/Users/kxs624/Documents/workspace/spoa/src/chain.hpp:55:34: error: expected member name or ';' after declaration specifiers
friend bioparser::FastqReader<Chain>;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
4 errors generated.
make[2]: *** [CMakeFiles/spoa.dir/src/chain.cpp.o] Error 1
make[1]: *** [CMakeFiles/spoa.dir/all] Error 2
make: *** [all] Error 2
Try git submodule update vendor/bioparser
from spoa root. Clear build folder and run cmake
and make
again.
I got it to run with your suggestions, thanks! I cannot evaluate the performance difference at this point as this commit does not output the graph (-d option). I would think affine would do a lot better since we are working with exon differences (i.e., larger deletions of segments), so affine would really make sense, however, at this point it is really only my intuition.
As for integrating this commit with the current version (that supposts fastq and prints dot-graph) --- It would have to be your call. I' not sure about what goes on in the code under the hood, but maybe the affine alignment could be a separate module that would not be mixed in with the default non-affine spoa code if the user decided to run as default. I definitely don't think it should be at the cost of resources for other types of alignment.
I cannot offer you anymore than a strong intuition at this point that affine gaps would help, but i could also evaluate this if the -d option was somehow introduced.
Integration won't be that much of a problem I guess. But, could you first do a couple of quick C/P and try to enable graph plots in that commit? I'll provide everything you need.
Awesome, the dot graph would serve as a basis to check if, in large, the paths makes more sense in an affine setting.
Here are the steps:
{"dot", required_argument, 0, 'd'},
before
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/main.cpp#L15std::string dot_path = "";
at
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/main.cpp#L37d:
at the end of string at
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/main.cpp#L39 graph->print_dot(dot_path);
at
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/main.cpp#L139case 'd':
dot_path = optarg;
break;
Also, FASTQ is supported with -q
option in that commit.
Hmm I think I got the steps correct but I ran into an error with make. Could the error be on my end?
CSE-mkspm01:build kxs624$ make
Scanning dependencies of target spoa
[ 14%] Building CXX object CMakeFiles/spoa.dir/src/alignment_engine.cpp.o
[ 28%] Building CXX object CMakeFiles/spoa.dir/src/chain.cpp.o
[ 42%] Building CXX object CMakeFiles/spoa.dir/src/graph.cpp.o
/Users/kxs624/Documents/workspace/spoa/src/graph.cpp:629:19: error: implicit instantiation of undefined template 'std::__1::basic_ofstream<char, std::__1::char_traits<char> >'
std::ofstream out(path);
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/iosfwd:133:33: note: template is declared here
class _LIBCPP_TYPE_VIS_ONLY basic_ofstream;
^
/Users/kxs624/Documents/workspace/spoa/src/graph.cpp:637:57: error: no member named 'endl' in namespace 'std'; did you mean 'end'?
out << "digraph " << num_sequences_ << " {" << std::endl;
~~~~~^~~~
end
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/initializer_list:109:1: note: 'end' declared here
end(initializer_list<_Ep> __il) _NOEXCEPT
^
/Users/kxs624/Documents/workspace/spoa/src/graph.cpp:638:45: error: no member named 'endl' in namespace 'std'; did you mean 'end'?
out << " graph [rankdir=LR]" << std::endl;
~~~~~^~~~
end
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/initializer_list:109:1: note: 'end' declared here
end(initializer_list<_Ep> __il) _NOEXCEPT
^
/Users/kxs624/Documents/workspace/spoa/src/graph.cpp:645:28: error: no member named 'endl' in namespace 'std'; did you mean 'end'?
out << "]" << std::endl;
~~~~~^~~~
end
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/initializer_list:109:1: note: 'end' declared here
end(initializer_list<_Ep> __il) _NOEXCEPT
^
/Users/kxs624/Documents/workspace/spoa/src/graph.cpp:653:32: error: no member named 'endl' in namespace 'std'; did you mean 'end'?
out << "]" << std::endl;
~~~~~^~~~
end
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/initializer_list:109:1: note: 'end' declared here
end(initializer_list<_Ep> __il) _NOEXCEPT
^
/Users/kxs624/Documents/workspace/spoa/src/graph.cpp:658:70: error: no member named 'endl' in namespace 'std'; did you mean 'end'?
out << " [style = dotted, arrowhead = none]" << std::endl;
~~~~~^~~~
end
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/initializer_list:109:1: note: 'end' declared here
end(initializer_list<_Ep> __il) _NOEXCEPT
^
/Users/kxs624/Documents/workspace/spoa/src/graph.cpp:662:24: error: no member named 'endl' in namespace 'std'; did you mean 'end'?
out << "}" << std::endl;
~~~~~^~~~
end
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/initializer_list:109:1: note: 'end' declared here
end(initializer_list<_Ep> __il) _NOEXCEPT
^
7 errors generated.
make[2]: *** [CMakeFiles/spoa.dir/src/graph.cpp.o] Error 1
make[1]: *** [CMakeFiles/spoa.dir/all] Error 2
make: *** [all] Error 2
Ugh, forgot one header. Add #include <fstream>
on top of src/graph.cpp
(and maybe #include <iostream>
).
I also need the affine gaps aligner for my project. When trying to compile the mentioned revision with g++ (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609 I get the following:
[ 14%] Building CXX object CMakeFiles/spoa.dir/src/alignment_engine.cpp.o [ 28%] Building CXX object CMakeFiles/spoa.dir/src/chain.cpp.o [ 42%] Building CXX object CMakeFiles/spoa.dir/src/graph.cpp.o [ 57%] Building CXX object CMakeFiles/spoa.dir/src/main.cpp.o [ 71%] Building CXX object CMakeFiles/spoa.dir/src/simd_alignment_engine.cpp.o In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/xmmintrin.h:1249:0, from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:31, from /usr/include/x86_64-linux-gnu/c++/5/bits/opt_random.h:33, from /usr/include/c++/5/random:50, from /usr/include/c++/5/bits/stl_algo.h:66, from /usr/include/c++/5/algorithm:62, from /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp:8: /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp: In static member function 'static spoa::mxxxi spoa::InstructionSet
::_mmxxx_slli_si(const __mxxxi&)': /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp:143:16: error: the last argument must be an 8-bit immediate return _mm_slli_si128(a, sizeof(type)); ^ /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp: In static member function 'static spoa:: mxxxi spoa::InstructionSet::_mmxxx_srli_si(const mxxxi&)': /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp:146:16: error: the last argument must be an 8-bit immediate return _mm_srli_si128(a, (kNumVar - 1) * sizeof(type)); ^ /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp: In static member function 'static spoa::__mxxxi spoa::InstructionSet ::_mmxxx_slli_si(const mxxxi&)': /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp:170:16: error: the last argument must be an 8-bit immediate return _mm_slli_si128(a, sizeof(type)); ^ /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp: In static member function 'static spoa::__mxxxi spoa::InstructionSet::_mmxxx_srli_si(const __mxxxi&)': /research/ium125/tmp/spoa/src/simd_alignment_engine.cpp:173:16: error: the last argument must be an 8-bit immediate return _mm_srli_si128(a, (kNumVar - 1) * sizeof(type)); ^ CMakeFiles/spoa.dir/build.make:158: recipe for target 'CMakeFiles/spoa.dir/src/simd_alignment_engine.cpp.o' failed make[2]: [CMakeFiles/spoa.dir/src/simd_alignment_engine.cpp.o] Error 1 CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/spoa.dir/all' failed make[1]: [CMakeFiles/spoa.dir/all] Error 2 Makefile:127: recipe for target 'all' failed make: *** [all] Error 2
Try replacing the following:
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/simd_alignment_engine.cpp#L143 with return _mm_slli_si128(a, 2);
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/simd_alignment_engine.cpp#L146 with return _mm_srli_si128(a, 14);
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/simd_alignment_engine.cpp#L170 with return _mm_slli_si128(a, 4);
https://github.com/rvaser/spoa/blob/a6dd47a3ccaf0c12e481b3e2476eea08fd97813d/src/simd_alignment_engine.cpp#L173 with return _mm_srli_si128(a, 12);
@rvaser thanks seems to work!
I made some preliminary tests and affine gap penalties definitely works better than the default when aligning transcripts. I would vote for this feature being included in some way.
I will try to push affine gaps to master next week.
Affine gaps are available on branch temp
(SISD version only, edit CMakeLists to compile without SIMD). SIMD version coming soon.
@ksahlin, @ilyaminkin, affine gaps are now live on master branch (v2.0.0
). I sped it up by applying prefix max on rows.
Affine gaps had some bugs which should be fixed now in v2.0.1
.
Hi again!
I had the chance to try out the new commit (v2.0.2). Sorry for being a pain...but I was quite happy with being able to set -e 0 in the first affine version you implemented. Is there a good reason why you removed this setting?
By first version you mean v2.0.0
or the earlier commit?
The earlier commit.
On a positive note FYI: I do se slightly improved alignments with the latest version vs the earlier commit with identical alignment parameters, so maybe -e 0
combined with latest version will give the holy grail answer :)
Enabled in latest commit (v2.0.3
).
Hi again,
[With the risk of asking something obvious once reading the POA-paper] Is it possible (or practically feasible) implement affine gap penalty when aligning a sequence to the graph?