fraunhoferhhi / vvenc

VVenC, the Fraunhofer Versatile Video Encoder
https://www.hhi.fraunhofer.de/en/departments/vca/technologies-and-solutions/h266-vvc.html
BSD 3-Clause Clear License
942 stars 170 forks source link

VVENC possibly overruns the stack with libc-musl #410

Closed AlexanderSchuetz97 closed 1 month ago

AlexanderSchuetz97 commented 2 months ago

I am compiling a static build of FFMPEG with vvenc on alpine linux. Alpine linux uses libc-musl. A notable difference between libc-musl and glibc is that the default stack size is significantly smaller. To test my theory I added the following to the linker arguments and recompiled ffmpeg and vvenc:

export LDFLAGS=-Wl,-z,stack-size=1048576

and then it no longer crashes and works as expected.

The call to ffmpeg that uses vvenc is:

/tmp/ffmpeg -y -v error -video_size 1280x720 -f rawvideo -pix_fmt bgr24 -i unix:///tmp/17438466013925160049.sock -f mp4 -vcodec vvc -r 30/1 -an /tmp/output16127279067242218599.bin

Its unlikely that the rawvideo decoder is the culprit here. Especially since if I use another codec like mpeg1video with the same call it does not crash. This leaves only vvenc as the possible culprit.

Does vvenc create threads where pthread_attr_setstacksize is not called? In this case libc musl allocates only between 80 and 128k for the stack. See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size This may not be enough for larger allocations.

adamjw24 commented 2 months ago

Very possible, we are not particularly strict about saving stack mem. Some alloca's are used here and there as well.

How crucial is it for your project? Any workaround you could use or is the only real solution to reduce actual stack usage?

AlexanderSchuetz97 commented 2 months ago

Well trying to get vvenc to work at all on alpine linux (presumeably with or without ffmpeg) requires setting the LDFLAGS as mentioned. The problem is that this reserves 10MB of stack for all threads in all of ffmpeg not just the threads that actually need that much stack. 10MB is coincidentally the amount that most versions of glibc choose for the stack size. I dont think you need that much stack, but since compiling ffmpeg takes a significant time and I do not quite know what determines your alloca's size I decided to stop experimenting and report this here.

libkvazaar (h265 encoder) had the same issue as vvenc. They solved it by calling pthread_attr_setstacksize with a size large enough for their alloca's. This is portable across all libc implementations as far as I know, so it should work with glibc/libc-musl and whatever else there is out there.

As mentioned the workaround with LDFLAGS works but this is not that "clean". I would prefer if the application set the required stack size to a reasonable amount and doesn't rely on the default behavior of the c-library. This would make vvenc more portable so its a good idea to do this regardless.

If you do not want to address this issue then I recommend at least documenting this somewhere as libc-musl is, as mentioned quite common in some container/docker environments.

Since I have a workaround I am under no time pressure.

adamjw24 commented 2 months ago

Thanks, we'll look into the pthread_attr_setstacksize function to ensure our code works independent of the underlying libc implementation.

K-os commented 1 month ago

@AlexanderSchuetz97: I implemented setting the stack size for the VVenC worker threads as you suggested in #411. Does that fix your issue?

There could in principle still be issues with the stack size limit on Alpine, since FFmpeg runs the "VVenC main" in a filter thread, which also uses the system's default stack size, but that thread and is out of our control and I didn't see any issues when I tested it.

AlexanderSchuetz97 commented 1 month ago

Thank you!

I will rebuild ffmpeg and run my unit tests again. I will give feedback once that is done.

One question, my unit tests test encoding with a 720p video 10s 24fps and not that large of a variety in colors. Does the size of your alloca's depend on the characteristics of the video?

If this depends on the input video (size/fps/length/pixel values) then could you supply me the worst case input video or close to that as a .raw (or any other video format)?

AlexanderSchuetz97 commented 1 month ago

The unit tests were successful. Unless the alloca's size depends on the input video then this issue is solved (see previous comment)