jserv closed this issue 2 years ago
Thanks for this. I am happy to help rework these changes into SSE2NEON if needed. I also extended the SSE2NEON test bench, located here: https://github.com/joshlvmh/iqtree_arm_neon/tree/7bc67d3449428b0b683fb7359c8945da218f5775/iq_tree_tests/sse2neon Please use any of it if you need.
Unimplemented intrinsics:
_mm_max_pd
_mm_min_pd
_mm_round_pd
_mm_insert_ps
_mm_sqrt_pd
_mm_cmple_pd
_mm_cmplt_pd
_mm_cmpneq_pd
_mm_hadd_pd
_mm_store_sd
_mm_cvttpd_epi32
_mm_cvtpd_epi32
_mm_cvtepi32_pd
_mm_getcsr
_mm_storel_pd
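For anyone checking ports of these intrinsics, here is a minimal scalar reference sketch for three of them (`_mm_max_pd`, `_mm_min_pd`, `_mm_hadd_pd`). It is not taken from SSE2NEON or from the iqtree_arm_neon fork. The `ref_m128d` type and all `ref_*` names are hypothetical stand-ins introduced only for illustration, so the code compiles on any platform without SSE or NEON headers. The comparison order is chosen to follow the documented x86 behaviour, where the second operand is returned if either input is NaN.

```c
#include <assert.h>

/* Hypothetical two-lane double vector standing in for __m128d. */
typedef struct {
    double lane[2];
} ref_m128d;

/* Reference model of _mm_max_pd: per lane, a if a > b, otherwise b.
 * Written this way so that b is returned when either input is NaN,
 * matching the x86 MAXPD behaviour. */
static ref_m128d ref_max_pd(ref_m128d a, ref_m128d b)
{
    ref_m128d r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = (a.lane[i] > b.lane[i]) ? a.lane[i] : b.lane[i];
    return r;
}

/* Reference model of _mm_min_pd: per lane, a if a < b, otherwise b. */
static ref_m128d ref_min_pd(ref_m128d a, ref_m128d b)
{
    ref_m128d r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = (a.lane[i] < b.lane[i]) ? a.lane[i] : b.lane[i];
    return r;
}

/* Reference model of _mm_hadd_pd: the low lane is the sum of both lanes
 * of a, the high lane is the sum of both lanes of b. */
static ref_m128d ref_hadd_pd(ref_m128d a, ref_m128d b)
{
    ref_m128d r;
    r.lane[0] = a.lane[0] + a.lane[1];
    r.lane[1] = b.lane[0] + b.lane[1];
    return r;
}

/* Small self-check using exactly representable values. Returns 0 on success. */
static int ref_check_examples(void)
{
    ref_m128d a = {{1.0, 5.0}};
    ref_m128d b = {{3.0, 2.0}};

    ref_m128d mx = ref_max_pd(a, b);
    ref_m128d mn = ref_min_pd(a, b);
    ref_m128d hs = ref_hadd_pd(a, b);

    assert(mx.lane[0] == 3.0 && mx.lane[1] == 5.0);
    assert(mn.lane[0] == 1.0 && mn.lane[1] == 2.0);
    assert(hs.lane[0] == 6.0 && hs.lane[1] == 5.0);
    return 0;
}
```

A test bench could compare the lanes produced by the real intrinsic against these scalar models for the same inputs. The remaining intrinsics in the list would need their own models, in particular the rounding and conversion ones, which depend on the current rounding mode.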
@joshlvmh, the SSE intrinsics required by iqtree_arm_neon were recently implemented in the latest SSE2NEON. Could you give it a try?
All intrinsics required by iqtree_arm_neon were implemented.
While porting IQ-TREE from SSE intrinsics to Arm NEON, @joshlvmh extended SSE2NEON with additional intrinsics. See https://github.com/joshlvmh/iqtree_arm_neon/blob/7bc67d3449428b0b683fb7359c8945da218f5775/IQ-TREE/sse2neon.h#L4136
It would be great if we could rework these changes back into SSE2NEON.