ermig1979 / Simd

C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC, NEON for ARM.
http://ermig1979.github.io/Simd
MIT License
2.03k stars, 406 forks

Some functions in this library are slower than OpenCV 4.5.5 #212

Open SheepKeeper1990 opened 2 years ago

SheepKeeper1990 commented 2 years ago

Hi. First of all, thank you for your contribution. When I tested Simd::Resize(), Simd::BgrToGray() and other functions from this library, some of them were 4-6 times slower than their OpenCV 4.5.5 counterparts. (OS: Ubuntu 18.04, CPU: i7-10750H, 12 cores)

Did I miss anything when using it? Could you give an image processing use case in which Simd is faster than the corresponding OpenCV function? If most functions are slower than OpenCV, what are the advantages of this library? I sincerely look forward to your answer.

ermig1979 commented 2 years ago

Hi! Thank you for your feedback. OpenCV uses all CPU cores by default. Simd specializes in single-threaded performance (for example, when resize is called from many threads at once).
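A minimal sketch of the usage pattern mentioned above: a single-threaded kernel is called from several threads, and each thread works on whole images of its own. The `Image` struct, `ToGray` and `ConvertBatch` are hypothetical names made up for this illustration, and `ToGray` is a plain scalar stand-in rather than a Simd function. In real code the stand-in would be replaced by the library call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical 8-bit image type used only for this sketch (not a Simd type).
struct Image {
    std::size_t width, height, channels;
    std::vector<std::uint8_t> data; // tightly packed, row-major
};

// Scalar stand-in for a single-threaded kernel such as a BGR-to-gray conversion.
// The integer weights approximate 0.114*B + 0.587*G + 0.299*R and sum to 256.
Image ToGray(const Image& bgr) {
    assert(bgr.channels == 3);
    Image gray{bgr.width, bgr.height, 1,
               std::vector<std::uint8_t>(bgr.width * bgr.height)};
    for (std::size_t i = 0; i < gray.data.size(); ++i) {
        const std::uint8_t* p = &bgr.data[3 * i];
        gray.data[i] = static_cast<std::uint8_t>(
            (29 * p[0] + 150 * p[1] + 77 * p[2] + 128) >> 8);
    }
    return gray;
}

// Converts every image in the batch. Each worker thread handles a disjoint
// subset of whole images, so the kernel itself stays single-threaded.
std::vector<Image> ConvertBatch(const std::vector<Image>& batch, unsigned threadCount) {
    std::vector<Image> out(batch.size());
    threadCount = std::max(1u, threadCount);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threadCount; ++t) {
        workers.emplace_back([&, t] {
            // Strided assignment: thread t takes images t, t + threadCount, ...
            for (std::size_t i = t; i < batch.size(); i += threadCount)
                out[i] = ToGray(batch[i]);
        });
    }
    for (std::thread& w : workers) w.join();
    return out;
}
```

With this layout the number of images in flight, not the kernel, determines how many cores are busy, which is the situation a single-threaded library is tuned for.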

SheepKeeper1990 commented 2 years ago

Hi! Thank you for your response. Does the Simd library support multithreaded computing? Could you show me a simple example or a pseudocode outline? Thanks again.

SheepKeeper1990 commented 2 years ago

I tried combining Simd functions such as Simd::ResizeBilinear() with OpenMP to get multi-core acceleration, but it didn't help. Executing it 10 times takes 7.79 ms, versus 0.73 ms for OpenCV 4.5.5. How can I improve this?

    omp_set_num_threads(12);
    #pragma omp parallel
    {
        Simd::ResizeBilinear(viewSrc, viewDst);
    }

ermig1979 commented 2 years ago

Hi! Unfortunately, it does not work that way. I would have to rewrite the implementation of `ResizeBilinear`. I will add this issue to my future development plans.
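For context on why the OpenMP snippet above gives no speedup: a bare `#pragma omp parallel` region has no work-sharing construct, so every thread executes the enclosed statement in full. All 12 threads therefore perform the same complete resize into the same destination. The sketch below shows the work-sharing form for a simpler case, a per-pixel conversion that can safely be split into row bands. `BgrToGrayRows` and `BgrToGrayParallel` are hypothetical names and a scalar stand-in, not Simd functions. A resize is harder to split this way because each output band must read source rows under the same global scale, which this sketch does not attempt.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar stand-in for a single-threaded per-pixel kernel (not a Simd function):
// converts rows [rowBegin, rowEnd) of a tightly packed BGR image to gray.
void BgrToGrayRows(const std::uint8_t* bgr, std::uint8_t* gray, std::size_t width,
                   std::size_t rowBegin, std::size_t rowEnd) {
    for (std::size_t i = rowBegin * width; i < rowEnd * width; ++i)
        gray[i] = static_cast<std::uint8_t>(
            (29 * bgr[3 * i] + 150 * bgr[3 * i + 1] + 77 * bgr[3 * i + 2] + 128) >> 8);
}

// Splits the image into contiguous row bands and lets OpenMP hand out the bands.
std::vector<std::uint8_t> BgrToGrayParallel(const std::vector<std::uint8_t>& bgr,
                                            std::size_t width, std::size_t height,
                                            int bandCount) {
    assert(bgr.size() == width * height * 3);
    std::vector<std::uint8_t> gray(width * height);
    bandCount = std::max(1, bandCount);
    const std::uint8_t* src = bgr.data();
    std::uint8_t* dst = gray.data();
    // Work-sharing loop: each band is processed exactly once, by one thread.
    // Without OpenMP enabled the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (int b = 0; b < bandCount; ++b) {
        const std::size_t rowBegin = height * b / bandCount;
        const std::size_t rowEnd = height * (b + 1) / bandCount;
        BgrToGrayRows(src, dst, width, rowBegin, rowEnd); // empty bands do nothing
    }
    return gray;
}
```

With GCC or Clang, `-fopenmp` is needed for the loop to actually run in parallel. The bands write to disjoint rows of the output, so no synchronization is required.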

SheepKeeper1990 commented 2 years ago

Thank you again for your contribution.

MyronRodrigues-StreetDrone commented 4 months ago

Hi @ermig1979, thank you for your contribution.

I have a similar issue with debayering: it is around 5 times slower than OpenCV's debayering. I'm new to this, but from what I understand, OpenCV uses all my CPU cores for debayering, whereas the Simd library uses a single thread for the operation. If that is true, is there any way to increase performance, or would the library implementation have to change?

theoractice commented 3 weeks ago

For my use case (shrinking small images to even smaller ones), Simd::Resize() is 3x faster than OpenCV 4.8, so it depends.