OpenVisualCloud / SVT-HEVC

SVT HEVC encoder. Scalable Video Technology (SVT) is a software-based video coding technology that is highly optimized for Intel® Xeon® processors. Using the open source SVT-HEVC encoder, it is possible to spread video encoding processing across multiple Intel® Xeon® processors to achieve a real advantage of processing efficiency.

Some threads are blocked on the waiting for semaphore after EB_SEND_END_OBJ called #499

Closed clayding closed 4 years ago

clayding commented 4 years ago

We found that sometimes one or more threads inside SVT-HEVC do not exit normally after EB_SEND_END_OBJ() has been called during encoder deinitialization, and may remain blocked waiting on the semaphore inside EbGetEmptyObject(). As a result, the FFmpeg pipeline does not finish normally when flush_encoders is called. We tested different SVT-HEVC branches and commits by running the following command line, and all of them show the same problem. To reproduce the issue, it is best to stop the pipeline before 100 frames have been processed; stopping after 20-40 frames or fewer reproduces it most easily.

./ffmpeg -pix_fmt uyvy422 -s 3200x2400 -f rawvideo -i ~/dockershare/3dat_backup_20200115083039_10.yuv -c:v libsvt_hevc -b:v 15M -preset 6 -rc 1 -tune 1 -f flv -timestamp now -y test.flv

Here is the stack trace we captured earlier with gdb.

(gdb) bt
#0  0x00007ffff56116d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x55569f0c62f0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  0x00007ffff56116d6 in do_futex_wait (sem=sem@entry=0x55569f0c62f0, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007ffff56117c8 in __new_sem_wait_slow (sem=0x55569f0c62f0, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007ffff5885239 in EbBlockOnSemaphore () at /usr/local/lib/libSvtHevcEnc.so.1
#4  0x00007ffff5884f13 in EbGetEmptyObject () at /usr/local/lib/libSvtHevcEnc.so.1
#5  0x00007ffff58407b6 in EbDeinitEncoder () at /usr/local/lib/libSvtHevcEnc.so.1
#6  0x0000555555677089 in eb_enc_close (avctx=<optimized out>) at libavcodec/libsvt_hevc.c:408
#7  0x000055555568fd84 in avcodec_close (avctx=avctx@entry=0x555558fde7c0) at libavcodec/utils.c:1085
#8  0x0000555555cac819 in avcodec_free_context (pavctx=0x555558fdc5e0) at libavcodec/options.c:178
#9  0x00005555556e5eb2 in ffmpeg_cleanup (ret=1) at fftools/ffmpeg.c:574
#10 0x00005555556d8cb1 in exit_program (ret=1) at fftools/cmdutils.c:139
#11 0x00005555556ee25d in flush_encoders () at fftools/ffmpeg.c:2013
#12 0x00005555556ee25d in transcode () at fftools/ffmpeg.c:4775
#13 0x00005555556ca42e in main (argc=41, argv=0x7fffffffdcd8) at fftools/ffmpeg.c:4962 

And the backtrace of one of the blocked threads:

(gdb) bt
#0  0x00007ffff56286d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x5555dbb68eb0)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  0x00007ffff56286d6 in do_futex_wait (sem=sem@entry=0x5555dbb68eb0, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007ffff56287c8 in __new_sem_wait_slow (sem=0x5555dbb68eb0, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007ffff589c179 in EbBlockOnSemaphore () at /usr/local/lib/libSvtHevcEnc.so.1
#4  0x00007ffff589be53 in EbGetEmptyObject () at /usr/local/lib/libSvtHevcEnc.so.1
#5  0x00007ffff58880b3 in PictureDecisionKernel () at /usr/local/lib/libSvtHevcEnc.so.1
#6  0x00007ffff561f6db in start_thread (arg=0x7fff4a864700) at pthread_create.c:463
#7  0x00007ffff534888f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
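
Both backtraces above end in an untimed semaphore wait (abstime=0x0) reached through EbGetEmptyObject(). As an illustration only, here is a minimal sketch of that general pattern and of one way a blocked waiter can be released at shutdown. It is not SVT-HEVC code: `demo_pool`, `demo_pool_get`, `demo_pool_shutdown` and `demo_run` are hypothetical names, and the sketch assumes POSIX unnamed semaphores as available on Linux.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical buffer pool: free_slots counts the empty objects available. */
typedef struct {
    sem_t           free_slots;
    pthread_mutex_t lock;
    bool            shutting_down; /* protected by lock */
} demo_pool;

int demo_pool_init(demo_pool *p, unsigned slots) {
    p->shutting_down = false;
    if (pthread_mutex_init(&p->lock, NULL) != 0) return -1;
    if (sem_init(&p->free_slots, 0, slots) != 0) return -1;
    return 0;
}

/* Blocks until a slot is free. Returns 0 on success, -1 once the pool
 * has been shut down (or on an unexpected sem_wait error). */
int demo_pool_get(demo_pool *p) {
    while (sem_wait(&p->free_slots) != 0) {
        if (errno != EINTR) return -1; /* retry only when interrupted */
    }
    pthread_mutex_lock(&p->lock);
    bool down = p->shutting_down;
    pthread_mutex_unlock(&p->lock);
    if (down) {
        sem_post(&p->free_slots); /* pass the wake-up on to the next waiter */
        return -1;
    }
    return 0;
}

/* Without this post, a thread waiting on an empty pool would never wake. */
void demo_pool_shutdown(demo_pool *p) {
    pthread_mutex_lock(&p->lock);
    p->shutting_down = true;
    pthread_mutex_unlock(&p->lock);
    sem_post(&p->free_slots);
}

static void *demo_worker(void *arg) {
    demo_pool *p = arg;
    return (void *)(intptr_t)demo_pool_get(p);
}

/* Starts a worker on an empty pool, shuts the pool down, and returns the
 * worker's result, or 1 if the setup itself failed. */
int demo_run(void) {
    demo_pool pool;
    pthread_t tid;
    void *res = NULL;

    if (demo_pool_init(&pool, 0) != 0) return 1;
    if (pthread_create(&tid, NULL, demo_worker, &pool) != 0) return 1;
    demo_pool_shutdown(&pool);
    pthread_join(tid, &res);
    sem_destroy(&pool.free_slots);
    pthread_mutex_destroy(&pool.lock);
    return (int)(intptr_t)res;
}
```

In this sketch the worker returns -1 whether it reaches sem_wait() before or after the shutdown call, because the flag is set before the semaphore is posted. If the process instead tears things down without such a wake-up, the worker stays in sem_wait() indefinitely, which is the shape of the hang reported here. Build with something like `cc -pthread`.
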

1480c1 commented 4 years ago

Please wrap your logs in triple backticks (`) to prevent GitHub from linking the frame numbers to issues #1 through #13.

https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks

(gdb) bt
#0 0x00007ffff56116d6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x55569f0c62f0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 0x00007ffff56116d6 in do_futex_wait (sem=sem@entry=0x55569f0c62f0, abstime=0x0) at sem_waitcommon.c:111
...

tianjunwork commented 4 years ago

Hi @clayding, thanks for posting the issue. We will try to reproduce it on our side.

tianjunwork commented 4 years ago

Hi @clayding, I ran your command line (with mp4 output) 500 times using the ffmpeg 4.2 release + SVT-HEVC master (OUT_ALLOC disabled). I didn't see a hang at the end of encoding. Would you like to try it again with our master?

clayding commented 4 years ago

Hi @tianjunwork, we found the root cause of the problem after repeating the test you ran. A couple of weeks ago, we found that our FFmpeg pipeline, when used with an old SVT-HEVC commit, would block if q was pressed before SVT-HEVC had finished initializing. We reproduced this and captured it in a video clip: https://drive.google.com/file/d/106LeV9BBxAKgVKrCwf2bIIjk9IYY8s07/view?usp=sharing

Here is the old commit of SVT-HEVC we used.

commit b9ec0139cbb057b547d7d2b0de6ddde5509c614f (HEAD -> develop, origin/develop)
Author: Jing Sun <jing.a.sun@intel.com>
Date:   Tue Aug 20 16:02:49 2019 +0800

    Add the "-thread-count" parameter support

    Signed-off-by: Jing Sun <jing.a.sun@intel.com>

commit 32147c8c3c570337cf68aefe7fe18c2c9fd75b7b
Author: tszumski <48432374+tszumski@users.noreply.github.com>
Date:   Thu Aug 22 22:50:42 2019 +0200

    Fix regression of windows debug build (#335)

    Fix regression of PR #264 for windows debug build
    To be clear: compilation succeeds but encoding fails

    Signed-off-by: Tomasz Szumski <tomasz.szumski@intel.com>

We then wrote a patch that moved the q-key check and committed it to our forked FFmpeg repo to solve the problem mentioned above.

diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c
index 9efd67c207..aea08eacff 100644
--- a/fftools/ffmpeg.c
+++ b/fftools/ffmpeg.c
@@ -4695,6 +4695,13 @@ static int transcode_step(void)
         av_assert0(ost->source_index >= 0);
         ist = input_streams[ost->source_index];
     }
+    {
+        int64_t cur_time= av_gettime_relative();
+        /* if 'q' pressed, exits */
+        if (stdin_interaction)
+            if (check_keyboard_interaction(cur_time) < 0)
+                exit_program(1);
+    }

     ret = process_input(ist->file_index);
     if (ret == AVERROR(EAGAIN)) {
@@ -4739,11 +4746,6 @@ static int transcode(void)
     while (!received_sigterm) {
         int64_t cur_time= av_gettime_relative();

-        /* if 'q' pressed, exits */
-        if (stdin_interaction)
-            if (check_keyboard_interaction(cur_time) < 0)
-                break;
-
         /* check if there's any stream where output is still needed */
         if (!need_output()) {
             av_log(NULL, AV_LOG_VERBOSE, "No more output streams to write to, finishing.\n");

So it is this patch that introduced the new problem. We have now tested FFmpeg, with the problematic patch removed, against the latest master branch of SVT-HEVC, and everything works fine.

Thanks for your help.

tianjunwork commented 4 years ago

@clayding Thank you for the update.