DLTcollab / sse2neon

A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation
MIT License

Missing intrinsics required for building Cycles #66

Closed by jserv 4 years ago

jserv commented 4 years ago

Cycles is Blender's physically-based path tracer for production rendering. Recently, Apple Inc. sent an AArch64 patch to the Blender developers, shown in D8237: Cycles: support neon instructions for arm64 processors. However, the file intern/cycles/util/util_sse_to_neon.h provided by D8237 was incomplete, and the Blender developers were discussing integrating sse2neon instead.

Missing intrinsics in SSE2NEON required for building Cycles:

D8237 also defines _mm_cmple_epi32 and _mm_cmpge_epi32, which are not part of the standard SSE intrinsics.

Fortunately, the simplified implementations are available in D8237.
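For reference, here is a minimal scalar sketch of what these two non-standard helpers would compute. It is not the code from D8237. It assumes they follow the same mask convention as the standard _mm_cmplt_epi32 and _mm_cmpgt_epi32 (each 32-bit lane becomes all ones when the comparison holds, zero otherwise). The type vec4i and the ref_ function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical 4-lane container standing in for __m128i / int32x4_t. */
typedef struct { int32_t lane[4]; } vec4i;

/* Assumed semantics: signed a <= b per lane, -1 (all bits set) if true, 0 if false. */
static vec4i ref_cmple_epi32(vec4i a, vec4i b)
{
    vec4i r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (a.lane[i] <= b.lane[i]) ? -1 : 0;
    return r;
}

/* Assumed semantics: signed a >= b per lane, -1 (all bits set) if true, 0 if false. */
static vec4i ref_cmpge_epi32(vec4i a, vec4i b)
{
    vec4i r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = (a.lane[i] >= b.lane[i]) ? -1 : 0;
    return r;
}
```

On NEON, the lane-wise comparisons themselves correspond to vcleq_s32 and vcgeq_s32, which return a uint32x4_t mask, so a wrapper would likely need only a reinterpret on top of them.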

jserv commented 4 years ago

Only one intrinsic is still missing for building Cycles: _mm_castps_pd, as described in #91.
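As a rough illustration of what _mm_castps_pd does: it reinterprets the 128 bits of four packed floats as two packed doubles without changing any bit. The scalar sketch below models that with memcpy. The types vec4f and vec2d and the function ref_castps_pd are hypothetical names, and this is not necessarily how #91 implements it.

```c
#include <string.h>

/* Hypothetical stand-ins for __m128 (4 x float) and __m128d (2 x double). */
typedef struct { float f[4]; } vec4f;
typedef struct { double d[2]; } vec2d;

/* Bit-preserving reinterpretation: the 16 bytes are copied unchanged,
 * no numeric float-to-double conversion takes place. */
static vec2d ref_castps_pd(vec4f a)
{
    vec2d r;
    memcpy(&r, &a, sizeof r);
    return r;
}
```

On AArch64 NEON, the equivalent reinterpretation is available as vreinterpretq_f64_f32.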