Closed tianjunwork closed 4 years ago
The hang is 100% reproducible with the command:
./SvtHevcEncApp -i bbb_1920x1080_420p.yuv -w 1920 -h 1080 -encMode 0 -intra-period 100 -scd 0 -rc 0 -q 22 -n 1000 -b 0.265
on some commits (such as the commit) before v1.3.0. The encoder hangs after encoding 296 or 298 frames.
On the current master tip (Disable OUT_ALLOC), the encoder hangs intermittently (somewhere between the 1st and 15th run), and only with -encMode 0, not with other encMode values.
./SvtHevcEncApp -i ../../../yuv/bbb_1920x1080_420p.yuv -w 1920 -h 1080 -encMode 0 -n 1000
This hangs at the 305th frame, somewhere between the 1st and 15th run.
./SvtHevcEncApp -i ../../../../yuv/bbb_1920x1080_420p.yuv -w 1920 -h 1080 -encMode 0 -intra-period 100 -n 1000
This hangs somewhere between the 1st and 8th run. With -intra-period 100 it always hangs at the 295th frame.
I can't reproduce the hang with ./SvtHevcEncApp -i bbb_1920x1080_420p.yuv -w 1920 -h 1080 -intra-period 100 -scd 0 -rc 0 -q 22 -n 1000 -b 0.265.
I ran a regression test on v1.3 with ./SvtHevcEncApp -i ../../../yuv/bbb_1920x1080_420p.yuv -w 1920 -h 1080 -encMode 0 -n 1000. The hang is easier to reproduce on v1.3: it occurs nearly 100% of the time.
Tested with 4K input as well; it didn't hang.
The equivalent ffmpeg command also hangs: ./ffmpeg -stream_loop -1 -video_size 1920x1080 -r 60 -i ../../yuv/bbb_1920x1080_420p.yuv -c:v libsvt_hevc -preset 0 -y 0.mp4
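Since the hang on master only shows up in some runs, a small harness that reruns the command under a timeout can make it easier to catch. The following is a minimal sketch, not part of SVT-HEVC: the function name first_hanging_run and the timeout value are assumptions for illustration, and the timeout has to be chosen well above the normal encode time on the test machine.

```python
import subprocess


def first_hanging_run(cmd, max_runs=15, timeout_s=600.0):
    """Run cmd up to max_runs times.

    Returns the 1-based index of the first run that does not finish within
    timeout_s seconds (treated as a hang), or None if every run finished.
    timeout_s is an assumed value, not something taken from the issue.
    """
    for run in range(1, max_runs + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # when the timeout elapses.
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return run
    return None
```

For example, calling first_hanging_run(["./SvtHevcEncApp", "-i", "bbb_1920x1080_420p.yuv", "-w", "1920", "-h", "1080", "-encMode", "0", "-n", "1000"]) would report which of the first 15 runs timed out, if any.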
Root cause of this issue: The single-threaded SvtHevcEncApp (and other single-threaded encoding frameworks such as ffmpeg) calls EbH265EncSendPicture, EbH265GetPacket and EbH265ReleaseOutBuffer sequentially. But EbH265EncSendPicture and EbH265GetPacket are asynchronous, so the application only gets the first available encoded packet after it has sent several frames. When EbH265EncSendPicture blocks because the Input FIFO is exhausted, the application can't release Output FIFO objects in time, and the pending frames in the encoding pipeline are in turn blocked in the single-threaded InitialRateControl kernel, where Output FIFO objects are dequeued. The application and the encoder therefore deadlock each other.
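The circular wait described above can be written down as a wait-for graph and checked for a cycle. This is only an illustrative sketch of my reading of the root cause: the node names are labels I made up, and the edge from the Input FIFO to the pending frames assumes that input buffers are only returned once pending frames make progress.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Back edge: everything from the first occurrence of node
            # up to here forms a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical labels for the parties named in the root cause.
WAITS_FOR = {
    # App is blocked in EbH265EncSendPicture waiting for an input buffer.
    "app": ["input_fifo"],
    # Assumption: input buffers come back only as pending frames progress.
    "input_fifo": ["pending_frames"],
    # Pending frames wait in InitialRateControl for an Output FIFO object.
    "pending_frames": ["output_fifo"],
    # Output FIFO objects come back only via EbH265ReleaseOutBuffer in the app.
    "output_fifo": ["app"],
}

cycle = find_cycle(WAITS_FOR)
print(" -> ".join(cycle) if cycle else "no cycle")
```

If the application were not blocked inside EbH265EncSendPicture (that is, without the "app" edge), the graph would have no cycle, which matches the observation that the lock-up needs both sides to be waiting on each other.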
Root cause tracing details:
(BTW, moving the dequeuing of Output FIFO objects from InitialRateControl to Packetization doesn't resolve this issue. The encoder still hangs when EncDec feeds POC 262 back to PictureManager: the single-threaded PictureManager is already stuck dequeuing the Child PCS FIFO objects (18 in total), so it can't handle the POC 262 posted by EncDec.)
Options to address this issue:
Fixed by #523
Thanks for capturing this!
First, this is a different (deterministic, not random) hang. It isn't addressed by this PR, and it isn't a regression caused by this PR either.
And here is the hang pattern due to encMode 0 and intra period:
Anyway, it's another hang issue we need to look at.
Originally posted by @Austin-Hu in https://github.com/OpenVisualCloud/SVT-HEVC/pull/475#issuecomment-597735296