ermig1979 / Simd

C++ image processing and machine learning library with using of SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX(Altivec) and VSX(Power7) for PowerPC, NEON for ARM.
http://ermig1979.github.io/Simd
MIT License
2.03k stars 406 forks source link

Memory issue with SimdSse41BgraToBgr.cpp, SimdAvx2BgraToBgr.cpp #193

Closed s-trinh closed 2 years ago

s-trinh commented 2 years ago

Hi,

We have hit some memory issues with color conversion, see: https://github.com/lagadic/visp/pull/990

Following changes https://github.com/lagadic/visp/pull/990/commits/ad646569c79e7bc645cb71347410cfaf2540d3ef should fix some issues.


But there are some other issues.

https://github.com/ermig1979/Simd/blob/dae44100d7313cf7c0c2adabe30a06135d403601/src/Simd/SimdSse41BgraToBgr.cpp#L32-L38

With bgra of size 17 (so align is false), we should have rather something like that:

image

Otherwise there are some faulty memory write:

Filters: RGB <==> RGBa conversion
==16723== Invalid write of size 8
==16723==    at 0x4A38D7F: _mm_storeu_si128 (emmintrin.h:727)
==16723==    by 0x4A38D7F: Store<false> (SimdStore.h:79)
==16723==    by 0x4A38D7F: BgraToBgrBody<false> (SimdSse41BgraToBgr.cpp:37)
==16723==    by 0x4A38D7F: void Simd::Sse41::BgraToBgr<false>(unsigned char const*, unsigned long, unsigned long, unsigned long, unsigned char*, unsigned long) (SimdSse41BgraToBgr.cpp:72)
==16723==    by 0x1542A1: ____C_A_T_C_H____T_E_S_T____24() (testColorConversion.cpp:374)
==16723==    by 0x1354DB: invoke (catch.hpp:14160)
==16723==    by 0x1354DB: Catch::RunContext::invokeActiveTestCase() (catch.hpp:13020)
==16723==    by 0x15D7CF: Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (catch.hpp:12993)
==16723==    by 0x171FE0: Catch::RunContext::runTest(Catch::TestCase const&) (catch.hpp:12754)
==16723==    by 0x174C73: execute (catch.hpp:13347)
==16723==    by 0x174C73: Catch::Session::runInternal() (catch.hpp:13553)
==16723==    by 0x1750A1: Catch::Session::run() (catch.hpp:13509)
==16723==    by 0x12E459: main (testColorConversion.cpp:1175)
==16723==  Address 0x1d58093c is 44 bytes inside a block of size 51 alloc'd

But unfortunately this fix does not work with bgra of size 32 (so align is true). Same issues with AVX2 code path.

ermig1979 commented 2 years ago

Hi! I added fix for Error in SSE4.1, AVX2, AVX-512BW optimizations of function BgraToRgba. Also I reproduced bug in function BgraToBgr. The fix will be later.

ermig1979 commented 2 years ago

I fixed Error in SSE4.1, AVX2 optimizations of function BgraToBgr. And similar bug in function BgraToRgb. I would be greate to independent check of it.

s-trinh commented 2 years ago

Thanks for the quick fix :+1: CI tests are now ok.