ermig1979 / Simd

C++ image processing and machine learning library with using of SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX(Altivec) and VSX(Power7) for PowerPC, NEON for ARM.
http://ermig1979.github.io/Simd
MIT License
2.03k stars 406 forks source link

Segmentation fault on freeing memory of resized output image after SimdResizerRun for SSE4 #222

Closed robinvanemden closed 2 years ago

robinvanemden commented 2 years ago

On an Intel Atom processor that support SIMD instruction sets up to and including SSE4.2 I receive a segmentation fault on freeing the pointer to the destination pixels after doing SimdResizerRun():

uint8_t* resized_image = malloc(dst_width * dst_c & dst_width * height);
void *resizer = SimdResizerInit(size_width, size_height, dst_width, dst_height, dst_c, SimdResizeChannelByte, SimdResizeMethodNearest);
SimdResizerRun(resizer, src_image_pointer, src_width * dst_c, resized_image, dst_width * dst_c);
SimdRelease(resizer);
free(resized_image)   ---------> Segmentation fault (core dumped)

This seems to happen independent of the chosen resize method.

On processors supporting instruction sets up to and including AVX2 I did not encounter this issue.

ermig1979 commented 2 years ago

Hi! Thank you for bug report. I tried to reproduce it and receive nothing. It works well for SSE4. Could you give me values of following parameters: size_width, size_height, dst_width, dst_height, dst_c to better bug reproducing?

robinvanemden commented 2 years ago

Hi! Thanks for looking into this. I believe the source size was 1920x1080x3 and the destination 256x256x3. But I will double check and see if I can generate a minimal example tomorrow. That way I can also ascertain whether the issue is replicable outside of our larger application.

robinvanemden commented 2 years ago

Good morning! I will be adding a minimal example here later today. I was, however, already able to replicate the segfault within our application on an Intel Atom x5-z8350, with these io sizes (src_c = 3):

size_width, size_height, dst_width, dst_height, dst_c: 685, 685, 256, 256, 3

However, when using the following source sizes, there's no segfault on free:

size_width, size_height, dst_width, dst_height, dst_c: 360, 360, 256, 256, 3

robinvanemden commented 2 years ago

I just tested with some isolated replication code, and SIMD ran perfect for both scenario's on the Atom. So I now suspect there may be some weird issue with our own code here. I'll delve deeper into the issue and keep you posted though!

robinvanemden commented 2 years ago

We tried overallocating memory , which seemed to resolve the segfault on Atom. However, we then got into issues with Synet functions downstream. Does the SIMD lib (for SSE4) somehow take ownership of the source/dest mem? We do not encounter these issues for AVX or neon on armv7 / aarch64 though. SIMD works like a charm there! But since the SIMD code is now well integrated in our project I will remain under the assumption this Atom issue is a weird side-effect that has to do with our own code. Should we close this issue for now, as we have not yet been able to generate a clean minimal example that illustrates the issue? We could then re-open when we have a better grasp on the issue.

[* Should we maybe have used SimdAllocate instead of malloc? Still, on NEON & AVX, we're doing fine with malloc.]

ermig1979 commented 2 years ago

Hi! Thank you for so detail bug report. I added checking of buffer overrun and reproduce the bug. Earlier I used for memory allocation function SimdAllocate which add extra memory before and after the array that lead to the hidding of the bug. I try to fix it in the nearest time.

robinvanemden commented 2 years ago

Thanks for the update. Super that you have been able to reproduce the bug! Might the issue also apply to other SIMD CPU extensions, but occur much less frequently? Or is it indeed SSE4 specific?

ermig1979 commented 2 years ago

I found that this bug is actual for SSE41 and AVX2 versions of Resizer. I fixed this bug.

robinvanemden commented 2 years ago

We tested today, and the Resizer issue is indeed resolved. Thank you!