google / highway

Performance-portable, length-agnostic SIMD with runtime dispatch
Apache License 2.0
4.18k stars 317 forks source link

Problem with AVX2 4a47570 #195

Closed Jamaika1 closed 3 years ago

Jamaika1 commented 3 years ago
In file included from c:\msys1200\x86_64-w64-mingw32\include\lib\jxl\dec_xyb.h:25,
                 from dec_xyb.cc:15:
c:\msys1200\x86_64-w64-mingw32\include\lib\jxl\image.h: In function 'bool jxl::SameSize(const Image1&, const Image2&) [with Image1 = jxl::Rect; Image2 = jxl::Image3<float>]':
c:\msys1200\x86_64-w64-mingw32\include\lib\jxl\image.h:407:21: error: inlining failed in call to 'always_inline' 'size_t jxl::Image3<T>::xsize() const [with ComponentType = float]': target specific option mismatch
  407 |   JXL_INLINE size_t xsize() const { return planes_[0].xsize(); }
      |                     ^~~~~
jan-wassenberg commented 3 years ago

Hi @Jamaika1, thanks for reporting this. It seems to be specific to JPEG XL, we are looking into this but do not yet understand the cause.

Do you specify any compiler flags such as -mavx2?

Jamaika1 commented 3 years ago

Our codecs are on the doom9 forum. I also added open source. In version highway 2f14bd0 AVX2 compiled. I used the command:

g++.exe -std=gnu++11 -ftree-vectorize -g0 -O3 -fPIC -mmmx -msse -msse2 -msse3 -mssse3 -msse4 -msse4a -msse4.1 -msse4.2 -mavx -mavx2 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602 -DJXL_DEBUG_WARNING=1 -DJXL_CRASH_ON_ERROR=1 -DJXL_DEBUG_ON_ALL_ERROR=1 -DJPEGXL_ENABLE_SKCMS=1 -DCMS_IS_WINDOWS_ -DCMS_RELY_ON_WINDOWS_STATIC_MUTEX_INIT -D__LITTLE_ENDIAN__ -DPNG_DEBUG=0 -DBITS_IN_JSAMPLE=8 -DINLINE="inline __attribute__((always_inline))" -DLOCAL(type)="static type" -DJPEGXL_MAJOR_VERSION=0 -DJPEGXL_MINOR_VERSION=3 -DJPEGXL_PATCH_VERSION=7 -DJPEGXL_ENABLE_JPEG=1 -DJPEGXL_ENABLE_GIF=1 -DJPEGXL_ENABLE_APNG=1 -DJPEGXL_ENABLE_EXR=1 -DJPEGXL_ENABLE_SJPEG=1 -DHWY_COMPILE_ONLY_STATIC -DJPEGXL_VERSION=\"0.3.7-30ea86ab\" -DHWY_STATIC_TARGET=HWY_AVX2 -DZLIB_WINAPI -DUSE_WINDOWS_MESSAGEBOX -DALIGN_SIZE=32 -c dec_xyb.cc -o dec_xyb.o
Jamaika1 commented 3 years ago

Sorry for the confusion. I downloaded latest GCC 12.0.0 on 5/19/2021 and the problem disappeared.

jan-wassenberg commented 3 years ago

@Jamaika1 thanks for sharing the command. Good to know the current GCC fixes this, but I'd still recommend removing the flags -mmmx -msse -msse2 -msse3 -mssse3 -msse4 -msse4a -msse4.1 -msse4.2 -mavx -mavx2. These are dangerous (can lead to such errors) and not necessary - Highway is able to generate x86 vector code even without them.

Jamaika1 commented 3 years ago

Again. For AVX2 not need mavx2

Edit: I have another question. Should I use mavx2 when jpegxl is in addition to metric butteraugli? The whole codec of liheif, libavif, libweb2 has mavx2 so jpegxl static without mavx2 in gcc doesn't work.

jan-wassenberg commented 3 years ago

@Jamaika1

so jpegxl static without mavx2 in gcc doesn't work.

That's surprising. If you remove -DHWY_STATIC_TARGET and -DHWY_COMPILE_ONLY_STATIC and -mavx2, we should still be able to use AVX2: see targets.h:253:

#if HWY_ARCH_X86
#define HWY_ATTAINABLE_TARGETS \
  HWY_ENABLED(HWY_SCALAR | HWY_SSE4 | HWY_AVX2 | HWY_AVX3)

Should I use mavx2 when jpegxl is in addition to metric butteraugli?

If we are talking about the Butteraugli from the JPEG XL repository, that one also does not need -mavx2, it uses the same Highway SIMD as JPEG XL.