amkozlov closed this issue 6 years ago
outstanding :-)
On 21.09.2016 15:31, Alexey Kozlov wrote:
As already discussed, the current tip-inner functions are not efficient for state-rich models (e.g. AA data). In fact, performance was even worse than inner-inner, even for reasonably long sequences (~1000 AA sites). An alternative approach based on a lookup table (i.e. the one used for DNA, the 4x4 model) is much faster.
I implemented it in a separate set of AVX kernels for the 20x20 model (AA), and it yielded speedups of 3x-5x for individual kernels (see attached profiles) and ~2x for the tree search.
TODO:
- implement SSE version
- implement generic version (any number of states)
aa_edgelh_before https://cloud.githubusercontent.com/assets/5624530/18676020/90576ff6-7f54-11e6-9317-6b9cd221ca87.png
aa_edgelh_after https://cloud.githubusercontent.com/assets/5624530/18676040/988dea88-7f54-11e6-8984-40a8d8015fef.png
Alexandros (Alexis) Stamatakis
Research Group Leader, Heidelberg Institute for Theoretical Studies; Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology; Adjunct Professor, Dept. of Ecology and Evolutionary Biology, University of Arizona at Tucson
www.exelixis-lab.org
As already discussed, the current tip-inner functions are not efficient for state-rich models (e.g. AA data). In fact, performance was even worse than inner-inner, even for reasonably long sequences (~1000 AA sites). An alternative approach based on a lookup table (i.e. the one used for DNA, the 4x4 model) is much faster.
I implemented it in a separate set of AVX kernels for the 20x20 model (AA), and it yielded speedups of 3x-5x for individual kernels (see attached profiles) and ~2x for the tree search.
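For readers unfamiliar with the lookup-table idea, here is a minimal scalar sketch in C. It is not the AVX kernel from this issue and not libpll's API: all function names, the flat array layout, the single rate category, and the absence of scaling are assumptions made for illustration. The point is that a tip can only take a small number of distinct codes, so the product of the tip's transition matrix with each code's indicator vector can be computed once and then looked up per site, instead of doing a 20x20 matrix-vector product per site.

```c
#include <assert.h>
#include <stddef.h>

#define STATES 20  /* amino-acid model (20x20) */

/* Hypothetical helper: for every tip code c, precompute
 * lookup[c][i] = sum_j pmatrix[i][j] * tipvec[c][j].
 * tipvec holds one 0/1 indicator vector per code (ambiguity codes set several 1s). */
void build_tip_lookup(size_t n_codes, const double *tipvec,
                      const double *pmatrix, double *lookup)
{
    assert(tipvec && pmatrix && lookup);
    for (size_t c = 0; c < n_codes; ++c)
        for (size_t i = 0; i < STATES; ++i) {
            double sum = 0.0;
            for (size_t j = 0; j < STATES; ++j)
                sum += pmatrix[i * STATES + j] * tipvec[c * STATES + j];
            lookup[c * STATES + i] = sum;
        }
}

/* Tip-inner update: the tip side is a table lookup, so only the
 * inner child needs a matrix-vector product per site. */
void update_partial_tip_inner(size_t sites, const unsigned char *tip_codes,
                              const double *lookup, const double *inner_clv,
                              const double *inner_pmatrix, double *parent_clv)
{
    for (size_t n = 0; n < sites; ++n) {
        const double *left = lookup + (size_t)tip_codes[n] * STATES;
        const double *right = inner_clv + n * STATES;
        for (size_t i = 0; i < STATES; ++i) {
            double sum = 0.0;
            for (size_t j = 0; j < STATES; ++j)
                sum += inner_pmatrix[i * STATES + j] * right[j];
            parent_clv[n * STATES + i] = left[i] * sum;
        }
    }
}

/* Reference inner-inner update: two matrix-vector products per site.
 * Shown only for comparison with the lookup variant above. */
void update_partial_inner_inner(size_t sites,
                                const double *left_clv, const double *left_pmatrix,
                                const double *right_clv, const double *right_pmatrix,
                                double *parent_clv)
{
    for (size_t n = 0; n < sites; ++n)
        for (size_t i = 0; i < STATES; ++i) {
            double l = 0.0, r = 0.0;
            for (size_t j = 0; j < STATES; ++j) {
                l += left_pmatrix[i * STATES + j] * left_clv[n * STATES + j];
                r += right_pmatrix[i * STATES + j] * right_clv[n * STATES + j];
            }
            parent_clv[n * STATES + i] = l * r;
        }
}
```

In this sketch the table costs one matrix-vector product per distinct tip code rather than per site, which is where the saving for long alignments would come from. A real kernel would also have to handle multiple rate categories, scaling, and vectorization.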
TODO: