xiph / opus

Modern audio compression for the internet.
https://opus-codec.org/

Build fails when AVX is disabled on Windows with the MSVC clang-cl compiler #256

Open wangyoucao577 opened 2 years ago

wangyoucao577 commented 2 years ago

Hi,

I'm trying to disable AVX when compiling on Windows, for compatibility with old CPUs that don't support AVX instructions. I'm using the clang-cl compiler (clang 12) installed with the latest Visual Studio 2022, but the build fails. Here are the reproduction steps and the error log.

> mkdir -p build && cd build 
> cmake .. -GNinja -DCMAKE_C_COMPILER=clang-cl -DOPUS_X86_MAY_HAVE_AVX=OFF
> ninja
[145/149] Building C object CMakeFiles\opus.dir\silk\x86\VQ_WMat_EC_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/VQ_WMat_EC_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe  /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\VQ_WMat_EC_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(88,26): error: always_inline function '_mm_cvtepi8_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
        v_cb_row_31_Q7 = OP_CVTEPI8_EPI32_M32( &cb_row_Q7[ 1 ] );
                         ^
E:\opus\celt/x86/x86cpu.h(68,3): note: expanded from macro 'OP_CVTEPI8_EPI32_M32'
 (_mm_cvtepi8_epi32(_mm_cvtsi32_si128(OP_LOADU_EPI32(x))))
  ^
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(90,26): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
        v_cb_row_31_Q7 = _mm_mul_epi32( v_XX_31_Q17, v_cb_row_31_Q7 );
                         ^
E:\opus\silk\x86\VQ_WMat_EC_sse4_1.c(91,26): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_VQ_WMat_EC_sse4_1' that is compiled without support for 'sse4.1'
        v_cb_row_42_Q7 = _mm_mul_epi32( v_XX_42_Q17, v_cb_row_42_Q7 );
                         ^
3 errors generated.
[147/149] Building C object CMakeFiles\opus.dir\silk\x86\NSQ_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/NSQ_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe  /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\NSQ_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\NSQ_sse4_1.c
E:\opus\silk\x86\NSQ_sse4_1.c(345,22): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    a_Q12_01234567 = _mm_shuffle_epi8( a_Q12_01234567, xmm_one );
                     ^
E:\opus\silk\x86\NSQ_sse4_1.c(346,22): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    a_Q12_89ABCDEF = _mm_shuffle_epi8( a_Q12_89ABCDEF, xmm_one );
                     ^
E:\opus\silk\x86\NSQ_sse4_1.c(357,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(358,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(366,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(367,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(376,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempa = _mm_shuffle_epi8( xmm_tempa, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(377,17): error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'ssse3'
    xmm_tempb = _mm_shuffle_epi8( xmm_tempb, xmm_one );
                ^
E:\opus\silk\x86\NSQ_sse4_1.c(394,33): error: '__builtin_ia32_palignr128' needs target feature ssse3
        psLPC_Q14_hi_89ABCDEF = _mm_alignr_epi8( psLPC_Q14_hi_01234567, psLPC_Q14_hi_89ABCDEF, 2 );
                                ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\tmmintrin.h(148,12): note: expanded from macro '_mm_alignr_epi8'
  (__m128i)__builtin_ia32_palignr128((__v16qi)(__m128i)(a), \
           ^
E:\opus\silk\x86\NSQ_sse4_1.c(395,33): error: '__builtin_ia32_palignr128' needs target feature ssse3
        psLPC_Q14_lo_89ABCDEF = _mm_alignr_epi8( psLPC_Q14_lo_01234567, psLPC_Q14_lo_89ABCDEF, 2 );
                                ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\tmmintrin.h(148,12): note: expanded from macro '_mm_alignr_epi8'
  (__m128i)__builtin_ia32_palignr128((__v16qi)(__m128i)(a), \
           ^
E:\opus\silk\x86\NSQ_sse4_1.c(442,30): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
                b_Q14_3210 = OP_CVTEPI16_EPI32_M64( b_Q14 );
                             ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_sse4_1.c(450,29): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
                xmm_tempa = _mm_mul_epi32( xmm_tempa, b_Q14_3210 );
                            ^
E:\opus\silk\x86\NSQ_sse4_1.c(455,37): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
                pred_lag_ptr_0123 = _mm_mul_epi32( pred_lag_ptr_0123, b_Q14_0123 );
                                    ^
E:\opus\silk\x86\NSQ_sse4_1.c(623,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_3210 = _mm_mul_epi32( xmm_xq_Q14_3210, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(624,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_x3x1 = _mm_mul_epi32( xmm_xq_Q14_x3x1, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(625,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_7654 = _mm_mul_epi32( xmm_xq_Q14_7654, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(626,31): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_10_16_sse4_1' that is compiled without support for 'sse4.1'
            xmm_xq_Q14_x7x5 = _mm_mul_epi32( xmm_xq_Q14_x7x5, xmm_Gain_Q10 );
                              ^
E:\opus\silk\x86\NSQ_sse4_1.c(633,31): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
            xmm_xq_Q14_3210 = _mm_blend_epi16( xmm_xq_Q14_3210, xmm_xq_Q14_x3x1, 0xCC );
                              ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
  (__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
            ^
E:\opus\silk\x86\NSQ_sse4_1.c(634,31): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
            xmm_xq_Q14_7654 = _mm_blend_epi16( xmm_xq_Q14_7654, xmm_xq_Q14_x7x5, 0xCC );
                              ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
  (__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
[148/149] Building C object CMakeFiles\opus.dir\silk\x86\NSQ_del_dec_sse4_1.c.obj
FAILED: CMakeFiles/opus.dir/silk/x86/NSQ_del_dec_sse4_1.c.obj
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\bin\clang-cl.exe  /nologo -DENABLE_HARDENING -DHAVE_CONFIG_H -DHAVE_LRINT -DHAVE_LRINTF -DOPUS_BUILD -DOPUS_HAVE_RTCD -DOPUS_X86_MAY_HAVE_SSE -DOPUS_X86_MAY_HAVE_SSE2 -DOPUS_X86_MAY_HAVE_SSE4_1 -DVAR_ARRAYS -D_CRT_SECURE_NO_WARNINGS -IE:\opus\include -IE:\opus\build -IE:\opus -IE:\opus\celt -IE:\opus\silk -IE:\opus\silk\float /DWIN32 /D_WINDOWS /W3 /MDd /Zi /Ob0 /Od /RTC1 /GS /showIncludes /FoCMakeFiles\opus.dir\silk\x86\NSQ_del_dec_sse4_1.c.obj /FdCMakeFiles\opus.dir\opus.pdb -c -- E:\opus\silk\x86\NSQ_del_dec_sse4_1.c
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(409,18): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
    a_Q12_0123 = OP_CVTEPI16_EPI32_M64( a_Q12 );
                 ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(410,18): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
    a_Q12_4567 = OP_CVTEPI16_EPI32_M64( a_Q12 + 4 );
                 ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(413,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
        a_Q12_89AB = OP_CVTEPI16_EPI32_M64( a_Q12 + 8 );
                     ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(414,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
        a_Q12_CDEF = OP_CVTEPI16_EPI32_M64( a_Q12 + 12 );
                     ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(418,22): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
        b_Q12_0123 = OP_CVTEPI16_EPI32_M64( b_Q14 );
                     ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(433,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                tmpa                = _mm_mul_epi32( pred_lag_ptr_tmp, b_Q12_0123 );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(437,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                pred_lag_ptr_tmp    = _mm_mul_epi32( pred_lag_ptr_tmp, b_sr_Q12_0123 );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(488,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_0123 );    /* 0, -1, -2, -3 * 0123 -> 0*0, 2*-2 */
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(495,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp ); /* 1*-1, 3*-3 */
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(502,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_4567 );
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(508,35): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
                                  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(517,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_89AB );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(523,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(530,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    tmpa            = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_CDEF );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(536,39): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_noise_shape_quantizer_del_dec_sse4_1' that is compiled without support for 'sse4.1'
                    psLPC_Q14_tmp   = _mm_mul_epi32( psLPC_Q14_tmp, a_Q12_tmp );
                                      ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(820,24): error: always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
        xmm_x16_x2x0 = OP_CVTEPI16_EPI32_M64( &(x16[ i ] ) );
                       ^
E:\opus\celt/x86/x86cpu.h(71,3): note: expanded from macro 'OP_CVTEPI16_EPI32_M64'
 (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x))))
  ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(825,24): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
        xmm_x16_x2x0 = _mm_mul_epi32( xmm_x16_x2x0, xmm_inv_gain_Q26 );
                       ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(826,24): error: always_inline function '_mm_mul_epi32' requires target feature 'sse4.1', but would be inlined into function 'silk_nsq_del_dec_scale_states_sse4_1' that is compiled without support for 'sse4.1'
        xmm_x16_x3x1 = _mm_mul_epi32( xmm_x16_x3x1, xmm_inv_gain_Q26 );
                       ^
E:\opus\silk\x86\NSQ_del_dec_sse4_1.c(831,24): error: '__builtin_ia32_pblendw128' needs target feature sse4.1
        xmm_x16_x2x0 = _mm_blend_epi16( xmm_x16_x2x0, xmm_x16_x3x1, 0xCC );
                       ^
C:\PROGRA~1\MICROS~3\2022\COMMUN~1\VC\Tools\Llvm\lib\clang\12.0.0\include\smmintrin.h(516,13): note: expanded from macro '_mm_blend_epi16'
  (__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
ninja: build stopped: subcommand failed.

After some investigation, I found that the problem is that clang-cl also sets the MSVC variable in CMake, so the SSE4.1 support check uses check_flag(SSE4_1 /arch:SSE2) # SSE2 and above (https://github.com/xiph/opus/blob/c9d5bea13e3cb7381bfa897a45d8bab4e7b767a7/cmake/OpusFunctions.cmake#L99), which is not sufficient for clang-cl.

MSVC doesn't support enabling SSE4.1 separately: it is only enabled with /arch:AVX or higher, and is disabled otherwise. See https://docs.microsoft.com/en-us/cpp/build/reference/arch-x86?view=msvc-170. Clang, however, supports the more flexible option -msse4.1. So in this case, I recommend using if (MSVC AND CMAKE_C_COMPILER_ID STREQUAL "MSVC") to decide whether to use /arch:xx or -mxx.
I have tested this in my setup and it works well. Any thoughts? If it's OK, I can file a PR for this.
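
A minimal CMake sketch of the condition proposed above, not the actual Opus build code. The variable names SKETCH_SSE4_1_FLAG and SKETCH_HAVE_MSSE4_1 are hypothetical, and the /arch:SSE2 value simply mirrors the linked check_flag line.

```cmake
# Sketch only: SKETCH_* names are hypothetical and do not exist in the Opus tree.
include(CheckCCompilerFlag)

if(MSVC AND CMAKE_C_COMPILER_ID STREQUAL "MSVC")
  # Real cl.exe: there is no dedicated SSE4.1 switch, so keep the
  # /arch: value that the existing check uses.
  set(SKETCH_SSE4_1_FLAG "/arch:SSE2")
else()
  # GCC, Clang, and clang-cl (where MSVC is true but the compiler ID is
  # "Clang") accept the GCC-style -m flags.
  check_c_compiler_flag(-msse4.1 SKETCH_HAVE_MSSE4_1)
  if(SKETCH_HAVE_MSSE4_1)
    set(SKETCH_SSE4_1_FLAG "-msse4.1")
  endif()
endif()

message(STATUS "SSE4.1 flag: ${SKETCH_SSE4_1_FLAG}")
```

With clang-cl this would select -msse4.1, which is the target feature the errors in the log above report as missing.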

One more question: from the compilation I saw that /arch:AVX or -mavx has been added to all files, which makes runtime detection useless. Is there a reason for that? Is it possible to move the AVX-optimized code into separate files, so that only those files are compiled with AVX and they are not run when runtime detection finds no AVX support?
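
One way to keep instruction-set flags off the rest of the library is to attach them per source file. The sketch below is an assumption about how that could look, not the project's actual CMake code; the list name sketch_sse4_1_sources is hypothetical, while the file names are the ones that fail in the log above.

```cmake
# Sketch only: sketch_sse4_1_sources is a hypothetical variable name.
set(sketch_sse4_1_sources
  silk/x86/NSQ_sse4_1.c
  silk/x86/NSQ_del_dec_sse4_1.c
  silk/x86/VQ_WMat_EC_sse4_1.c)

# Only compilers that understand GCC-style -m flags (including clang-cl)
# get the option; real MSVC is skipped here.
if(NOT (MSVC AND CMAKE_C_COMPILER_ID STREQUAL "MSVC"))
  # The flag applies to these translation units only; every other file
  # is still built for the baseline instruction set.
  set_source_files_properties(${sketch_sse4_1_sources}
    PROPERTIES COMPILE_FLAGS "-msse4.1")
endif()
```

The same pattern would apply to AVX sources with -mavx. Note that before CMake 3.18, set_source_files_properties only affects targets defined in the same directory scope as the call.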

Thanks!

xnorpx commented 2 years ago

Hi,

clang-cl was never tested, so bugs there are not unexpected.

I think you are correct that the file property settings for MSVC are not set correctly, but I need some more time to refresh my memory. For runtime detection, the flags should certainly be applied only to the corresponding files.

So in summary:

Feel free to file a PR; filing it on GitLab will make things easier to merge.

xnorpx commented 2 years ago

So, regarding SSE4.1: there is no auto-vectorization with MSVC, so no compiler flag is needed for the SSE4.1 sources.

See (https://stackoverflow.com/a/64057905)

Regarding AVX: Opus currently has no AVX optimizations, so it's either PRESUME_AVX, which enables AVX in the whole build, or MAY_HAVE, which is essentially a no-op.

I am working on rewriting the intrinsics logic in CMake, so I will try to add more comments.

wangyoucao577 commented 2 years ago

I emailed webmaster@xiph.org several days ago to request GitLab forking permission, but haven't received a response so far. Could you please take a look so that I can fork and create a PR on GitLab? Thanks!

xiphmont commented 2 years ago

Hiya, saw the mail, just behind.

No need to ask permission to fork! Once you have a PR, I'll be happy to look.

wangyoucao577 commented 2 years ago

Sorry, but the fork button is greyed out, so I can't fork it.

wangyoucao577 commented 2 years ago

> So, regarding SSE4.1: there is no auto-vectorization with MSVC, so no compiler flag is needed for the SSE4.1 sources.
>
> See (https://stackoverflow.com/a/64057905)
>
> Regarding AVX: Opus currently has no AVX optimizations, so it's either PRESUME_AVX, which enables AVX in the whole build, or MAY_HAVE, which is essentially a no-op.
>
> I am working on rewriting the intrinsics logic in CMake, so I will try to add more comments.

Yes, MSVC doesn't have a flag for SSE4.1, but the MSVC-compatible clang-cl does.
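
For reference, a short sketch of how a CMake script could tell clang-cl apart from real MSVC. It assumes CMake 3.14 or newer, where CMAKE_C_COMPILER_FRONTEND_VARIANT is available; the status messages are illustrative only.

```cmake
# Sketch only: requires CMake >= 3.14 for the FRONTEND_VARIANT variable.
if(CMAKE_C_COMPILER_ID STREQUAL "Clang"
   AND "${CMAKE_C_COMPILER_FRONTEND_VARIANT}" STREQUAL "MSVC")
  # clang-cl: the MSVC variable is true, yet GCC-style flags such as
  # -msse4.1 are accepted.
  message(STATUS "clang-cl detected")
elseif(CMAKE_C_COMPILER_ID STREQUAL "MSVC")
  # cl.exe: only the /arch: family of switches is available.
  message(STATUS "MSVC cl.exe detected")
endif()
```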

wangyoucao577 commented 2 years ago

https://github.com/xiph/opus/pull/257
I've opened a PR here, since I still can't fork or create branches on GitLab. Please help review it. If it looks good to you, I can move it to GitLab once I have permission. Thanks!

abique commented 9 months ago

I'm also getting compilation errors with opus + clang-cl. install-x64-windows-bitwig-rel-out.log