HomeOfVapourSynthEvolution / VapourSynth-TCanny

TCanny filter for VapourSynth
GNU General Public License v3.0

malloc failure / alignment error on arm64 -> fix included #13

Closed. Stefan-Olt closed this issue 2 years ago.

Stefan-Olt commented 2 years ago

I built TCanny on an Apple Silicon machine and always got the error message "malloc failure (blur)" when I tried to use it. I noticed that on line 581 of TCanny.cpp the alignment is set to 4, and on ARM it is not overridden by the SIMD selection.

I made a quick and dirty fix and set it to 8 (d->alignment = 8;), which solved the problem. Maybe you could either change to a fixed alignment, or perhaps sizeof(void*) would also work (not tested).
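
One possible explanation for why 4 fails while 8 works, sketched under assumptions: on POSIX systems, posix_memalign requires the alignment to be a power of two and a multiple of sizeof(void*), so on a 64-bit target a request for 4-byte alignment is rejected with EINVAL. Whether TCanny's buffers are ultimately allocated through posix_memalign is an assumption here, not something confirmed in this thread. The helper names safe_alignment and aligned_alloc_or_null are hypothetical and are not part of TCanny or VapourSynth.

```cpp
#include <algorithm>  // std::max
#include <cassert>    // assert
#include <cstddef>    // std::size_t
#include <stdlib.h>   // posix_memalign, free (POSIX only, not available with MSVC)

// Hypothetical helper: raise a requested power-of-two alignment to at least
// sizeof(void*), the smallest value posix_memalign accepts.
inline std::size_t safe_alignment(std::size_t requested) {
    // Precondition: requested is a non-zero power of two.
    assert(requested != 0 && (requested & (requested - 1)) == 0);
    // The larger of two powers of two is a power of two, and any power of
    // two >= sizeof(void*) is a multiple of it.
    return std::max(requested, sizeof(void*));
}

// Hypothetical helper: allocate `size` bytes with the given alignment.
// Returns nullptr if posix_memalign reports an error (EINVAL or ENOMEM).
inline void* aligned_alloc_or_null(std::size_t size, std::size_t alignment) {
    void* p = nullptr;
    if (posix_memalign(&p, alignment, size) != 0)
        return nullptr;
    return p;
}
```

With these helpers, aligned_alloc_or_null(64, safe_alignment(alignof(float))) requests 8-byte alignment on a 64-bit target, which matches the manual fix above; the returned pointer is released with free().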

HolyWu commented 2 years ago

Please try the latest commit to see if it works for you, thanks.

Stefan-Olt commented 2 years ago

No, it doesn't. alignof(float) is 4, so it doesn't change anything.

HolyWu commented 2 years ago

Okay. Please try again.

Stefan-Olt commented 2 years ago

That fixes it, thank you very much!

shssoichiro commented 2 years ago

I think there's still something wrong here. I was having this issue on x86_64 Arch Linux. With all optimizations disabled, I would get the "malloc failure (blur)" message, and with any optimizations enabled, it would return a blank plane for any plane that was passed in.

The latest commit fixes the C version, but the optimized versions still return blank planes. (This is with mode=-1 and float input.)

HolyWu commented 2 years ago

Did you mean that only mode=-1 gives blank output? Have you tried the other modes and integer input?

shssoichiro commented 2 years ago

It seems to occur with any mode and any input format. Integer input formats result in pure green output and float input results in pure black output. But this only happens when ASM is enabled (it occurs with SSE2 or AVX2; I don't have AVX512 to test with).

HolyWu commented 2 years ago

That's weird. What CPU and compiler are you using? Try recompiling a debug build with LTO disabled, using meson build -Dbuildtype=debug -Db_lto=false.

shssoichiro commented 2 years ago

I've reproduced this on three machines, a Ryzen 3700X, a 4800H, and a 5950X, all running Arch Linux. My other Ryzen 3700X running Windows 11 does not have this issue. The Windows box uses a prebuilt library from vsrepo. The Linux boxes use the meson build, which looks like it defaults to GCC (11.2). I've tried disabling all extra CFLAGS as well, with no effect. Same with the debug build command you posted.

shssoichiro commented 2 years ago

Interestingly, it seems to function correctly with assembly enabled if I compile the library using clang. So this seems to be something related to GCC specifically.

HolyWu commented 2 years ago

Yeah, I also reproduced it using MSYS2/mingw-w64 to build with GCC 11.2. With MSVC or Clang everything works fine.