rigaya / NVEnc

Performance experiments on high-speed encoding with NVENC
https://rigaya34589.blog.fc2.com/blog-category-17.html

when --dhdr10-info copy and subburn used together then nvencc abends #437

Closed Shanghai-Tom closed 1 year ago

Shanghai-Tom commented 1 year ago

->> When --dhdr10-info copy and subburn are used together, nvencc crashes.

NVEnc (x64) 7.02 (r2361) by rigaya, Oct 26 2022 16:41:43 (gcc 9.4.0/Linux)

Linux 5.15.0-46-generic #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

->> parameters

--video-metadata 1?language=eng --avsync forcecfr --avsw -c hevc --cqp 27 --lookahead 30
--output-depth 10 --output-thread 1 --cuda-schedule sync --dhdr10-info copy --video-metadata title="H265_HDR10+:10bit 3840x1600"
--colorprim bt2020 --colormatrix bt2020nc --colorrange limited --mv-precision Q-pel
--master-display "G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1)" --max-cll 173,171 --transfer smpte2084 --audio-codec 1?ac3 --audio-stream 1?5.1 --audio-bitrate 1?640 --audio-samplerate 1?48000 --audio-metadata 1?language=eng --audio-metadata 1?title="AC-3 ; 6 " -u P7 --vpp-subburn filename="./S01E01 - A Shadow of the Past.for.srt",charcode=utf-8
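For readers checking the `--master-display` string against the encoder log, here is a minimal sketch of the unit conversion. It assumes the denominators visible elsewhere in this thread (chromaticity over 50000, luminance over 10000, as in the ffprobe report); the function name `parse_master_display` is made up for illustration and is not part of NVEnc.

```python
import re

CHROMA_DENOM = 50000  # chromaticity values are in units of 0.00002
LUMA_DENOM = 10000    # luminance values are in units of 0.0001 cd/m^2


def parse_master_display(spec):
    """Convert a G(..)B(..)R(..)WP(..)L(..) string into floating-point pairs.

    Hypothetical helper: chromaticities come back as (x, y),
    and "L" comes back as (max_luminance, min_luminance).
    """
    fields = {}
    for name, a, b in re.findall(r"(G|B|R|WP|L)\((\d+),(\d+)\)", spec):
        denom = LUMA_DENOM if name == "L" else CHROMA_DENOM
        fields[name] = (int(a) / denom, int(b) / denom)
    return fields


spec = "G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1)"
for name, pair in parse_master_display(spec).items():
    print(name, pair)
```

The converted values can be compared with the `MasteringDisp` line that nvencc prints in its output summary.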

->> from Mediainfo: HDR : SMPTE ST 2094 App 4, Version 1, HDR10+ Profile B compatible

->> Input video is :

->> Stream 2 is extracted to mov_text subrip so that it can be burned in

->> now execute nvencc

Input #0, matroska,webm,
  Metadata:
    encoder : libebml v1.4.0 + libmatroska v1.6.1
  Duration: 01:05:47.49, start: 0.000000, bitrate: 18665 kb/s
Input #0, srt, from './S01E01 - A Shadow of the Past.for.srt':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Subtitle: subrip
Shaper: FriBidi 0.19.7 (SIMPLE) HarfBuzz-ng 2.6.4 (COMPLEX)
Using font provider fontconfig
Max B frames are 0 frames.
NVEnc (x64) 7.02 (r2361) by rigaya, Oct 26 2022 16:41:43 (gcc 9.4.0/Linux)
OS Version Linux Mint 20.1 (5.15.0-46-generic)
CPU AMD Ryzen Threadripper 2990WX 32-Core Processor (8C/64T) <<-- ???
GPU #0: NVIDIA GeForce GTX 1070 (1920 cores, 1784 MHz)[PCIe3x16][470.14]
NVENC / CUDA NVENC API 11.1, CUDA 11.4, schedule mode: sync
Input Buffers CUDA, 39 frames
Input Info avsw: hevc(yv12(10bit))->p010 [AVX2], 3840x1600, 24/1 fps

->> I tried first with avhw, then avsw, both failed in the same way

AVSync forcecfr

Vpp Filters    copyHtoD
               cspconv(p010 -> yv12(16bit))
               subburn: ./S01E01 - A Shadow of the Past.for.srt, scale x1.00
               cspconv(yv12(16bit) -> p010)

Output Info H.265/HEVC main10 @ Level auto
  3840x1600p 1:1 24.000fps (24/1fps)
  avwriter: hevc, #1:eac3/6ch:5.1 -> ac3/5.1/640kbps => mp4
Encoder Preset quality
Rate Control CQP I:27 P:27 B:27
ChromaQPOffset cb:0 cr:0
Lookahead on, 30 frames, Adaptive I, B Insert
GOP length 240 frames
B frames 0 frames [ref mode: disabled]
Ref frames 3 frames
AQ off
CU max / min auto / auto
VUI matrix:bt2020nc,colorprim:bt2020,transfer:smpte2084,range:limited
MasteringDisp G(0.265000 0.690000) B(0.150000 0.060000) R(0.680000 0.320000) WP(0.312700 0.329000) L(1000.000000 0.000100)
MaxCLL/MaxFALL 173/171
Others mv:Q-pel repeat-headers
Output #0, mp4, to '/TOM512F2FS/TV_Series/removed
  Metadata:
    encoding_tool : NVEnc (x64) 7.02
    encoder : Lavf58.29.100
  Stream #0:0: Video: hevc (Main 10) (hev1 / 0x31766568), yuv420p10le(bt2020nc/bt2020/smpte2084, progressive), 3840x1600 [SAR 1:1 DAR 12:5], q=2-31, 200 kb/s, 24 fps, 12288 tbn (default)
    Metadata:
      1?language : eng
      title : H265_HDR10+:10bit 3840x1600
  Stream #0:1(eng): Audio: ac3 (ac-3 / 0x332D6361), 48000 Hz, 5.1, fltp, 640 kb/s (default)
    Metadata:
      title : AC-3 ; 6
fontselect: (Arial, 400, 0) -> /usr/share/fonts/truetype/msttcorefonts/Arial.ttf, 0, ArialMT

->> Immediately after the fontselect message, nvencc crashes

nvencc[2019701]: segfault at 8 ip 000055dc101357ab sp 00007fff3ffc6330 error 4 in nvencc[55dc100cd000+472000]
Code: c0 10 e9 81 fb ff ff 48 89 f8 4c 29 c0 48 c1 f8 04 48 83 f8 02 74 59 48 83 f8 03 74 46 48 83 f8 01 0f 85 48 f3 ff ff 49 8b 00 <83> 78 08 03 0f 85 3b f3 ff ff 4c 39 c7 0f 85 54 fb ff ff e9 2d f3

Process 2019701 (nvencc) of user 1000 dumped core.

Stack trace of thread 2019701:
#0 0x000055dc101357ab _ZN9NVEncCore16NvEncEncodeFrameEP13_EncodeBufferilliRKSt6vectorISt10shared_ptrI12RGYFrameDataESaIS5_EE (nvencc + 0x9d7ab)
#1 0x000055dc10137384 _ZN9NVEncCore6EncodeEv (nvencc + 0x9f384)
#2 0x000055dc100fb1de main (nvencc + 0x631de)
#3 0x00007fe6e073a083 __libc_start_main (libc.so.6 + 0x24083)
#4 0x000055dc10100bfe _start (nvencc + 0x68bfe)

Stack trace of thread 2019712:
#0 0x00007fe6e0917376 futex_wait_cancelable (libpthread.so.0 + 0xf376)
#1 0x00007fe6e1919996 n/a (libavcodec.so.58 + 0x623996)
#2 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#3 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019706:
#0 0x00007fe6e082899f __GI___poll (libc.so.6 + 0x11299f)
#1 0x00007fe6e2d3b221 n/a (libcuda.so.1 + 0x32d221)
#2 0x00007fe6e2c78c7a n/a (libcuda.so.1 + 0x26ac7a)
#3 0x00007fe6e2d33576 n/a (libcuda.so.1 + 0x325576)
#4 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#5 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019710:
#0 0x00007fe6e0917376 futex_wait_cancelable (libpthread.so.0 + 0xf376)
#1 0x00007fe6e1919996 n/a (libavcodec.so.58 + 0x623996)
#2 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#3 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019734:
#0 0x00007fe6e0917376 futex_wait_cancelable (libpthread.so.0 + 0xf376)
#1 0x00007fe6e4204e30 _ZNSt18condition_variable4waitERSt11unique_lockISt5mutexE (libstdc++.so.6 + 0xd0e30)
#2 0x000055dc10260b6b _Z19WaitForSingleObjectPvj (nvencc + 0x1c8b6b)
#3 0x000055dc10267656 _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN13RGYConvertCSP3runEiPPvPPKviiiiiiPiEUlvE_EEEEE6_M_runEv (nvencc + 0x1cf656)
#4 0x00007fe6e420ade4 n/a (libstdc++.so.6 + 0xd6de4)
#5 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#6 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019708:
#0 0x00007fe6e082899f __GI___poll (libc.so.6 + 0x11299f)
#1 0x00007fe6e2d3b221 n/a (libcuda.so.1 + 0x32d221)
#2 0x00007fe6e2c78c7a n/a (libcuda.so.1 + 0x26ac7a)
#3 0x00007fe6e2d33576 n/a (libcuda.so.1 + 0x325576)
#4 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#5 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019709:
#0 0x00007fe6e09177d1 futex_abstimed_wait_cancelable (libpthread.so.0 + 0xf7d1)
#1 0x00007fe6e2be7f4f n/a (libcuda.so.1 + 0x1d9f4f)
#2 0x00007fe6e2d33576 n/a (libcuda.so.1 + 0x325576)
#3 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#4 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019711:
#0 0x00007fe6e0917376 futex_wait_cancelable (libpthread.so.0 + 0xf376)
#1 0x00007fe6e1919996 n/a (libavcodec.so.58 + 0x623996)
#2 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#3 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019718:
#0 0x00007fe6e0917376 futex_wait_cancelable (libpthread.so.0 + 0xf376)
#1 0x00007fe6e1919996 n/a (libavcodec.so.58 + 0x623996)
#2 0x00007fe6e0910609 start_thread (libpthread.so.0 + 0x8609)
#3 0x00007fe6e0835133 __clone (libc.so.6 + 0x11f133)

Stack trace of thread 2019721:
#0 0x00007fe6e1c0690c n/a (libavcodec.so.58 + 0x91090c)
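The kernel "segfault at ..." line in the log above can be decoded mechanically. The sketch below is an illustration, not something from NVEnc: it assumes the usual Linux x86 message layout and the x86 page-fault error-code bits (bit 0 = page was present, bit 1 = write access, bit 2 = fault raised in user mode). The helper name `decode_segfault` is hypothetical.

```python
import re

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<start>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode a Linux x86 'segfault at' kernel log line (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    ip = int(m["ip"], 16)
    start = int(m["start"], 16)
    return {
        "object": m["obj"],
        "fault_address": int(m["addr"], 16),
        "page_present": bool(err & 1),             # bit 0: 0 means page not mapped
        "access": "write" if err & 2 else "read",  # bit 1
        "user_mode": bool(err & 4),                # bit 2
        "ip_offset_in_mapping": ip - start,        # offset inside the bracketed mapping
    }


line = ("nvencc[2019701]: segfault at 8 ip 000055dc101357ab sp 00007fff3ffc6330 "
        "error 4 in nvencc[55dc100cd000+472000]")
info = decode_segfault(line)
print(info["access"], hex(info["fault_address"]), hex(info["ip_offset_in_mapping"]))
```

A user-mode read of an unmapped page at address 8 is what one would typically see when a field is read through a null pointer.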

Shanghai-Tom commented 1 year ago

I extracted the hdr10+ metadata to a json file and used that as input (--dhdr10-info "S01E01 - A Shadow of the Past.json"), and the execution was successful with subburn.

rigaya commented 1 year ago

Thank you for the detailed information of the error!

The stack trace helps me a lot; it seems a null reference has occurred. I've made commit 701be03, which might avoid this crash. However, there may still be another error (or dhdr10-info may not be copied properly), as I was not able to reproduce this error with my files and could not figure out why null is set where it should not be.

> I extracted the hdr10+ to a json file and used that as input (--dhdr10-info "S01E01 - A Shadow of the Past.json") and the execution was successful with subburn

Thank you for the further testing. It seems the problem occurs when copying dynamic HDR info from the input file.

I have one question: does this mean that the problem disappears when using "--dhdr10-info copy" alone (without subburn)? It's a bit weird, as subburn should not be related to --dhdr10-info copy...

Shanghai-Tom commented 1 year ago

Hello, thank you for your prompt response and the change to the code.

When using --dhdr10-info copy alone I have no problems with version 7.02; all the files completed successfully.

I have downloaded the latest code and built NVEnc (x64) 7.03 (r2367), and the same file now converts successfully with --dhdr10-info copy and --vpp-subburn active.

The crash with subburn is now fixed, but the output file now has no hdr10+ sidedata and is only recognised as HDR.

The change is a regression because hdr10+ sidedata was in the file with the prior version 7.02 without using subburn.

I have included the input file's ffprobe sidedata report further down, and verified that it is the same as the json file extracted by hdr10plus_tool.

NVEnc (x64) 7.03 (r2367) by rigaya, Nov 7 2022 15:18:16 (gcc 9.4.0/Linux)
.. ..
VUI matrix:bt2020nc,colorprim:bt2020,transfer:smpte2084,range:limited
MasteringDisp G(0.265000 0.690000) B(0.150000 0.060000) R(0.680000 0.320000) WP(0.312700 0.329000) L(1000.000000 0.000100)
MaxCLL/MaxFALL 173/171
Dynamic HDR10 copy
Others mv:Q-pel repeat-headers
fontselect: (Arial, 400, 0) -> /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf, 0, DejaVuSans

VUI matrix:bt2020nc,colorprim:bt2020,transfer:smpte2084,range:limited
MasteringDisp G(0.265000 0.690000) B(0.150000 0.060000) R(0.680000 0.320000) WP(0.312700 0.329000) L(1000.000000 0.000100)
MaxCLL/MaxFALL 173/171
Dynamic HDR10 ./S01E01 - A Shadow of the Past.json
Others mv:Q-pel repeat-headers
fontselect: (Arial, 400, 0) -> /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf, 0, DejaVuSans

In both tests the hdr10+ sidedata is not in the output file. The hdr10+ frame data may be there, but I have no tool to verify it.

{ "JSONInfo": { "HDR10plusProfile": "B", "Version": "1.0" }, "SceneInfo": [ { "BezierCurveData": { "Anchors": [ 102, etc
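A quick way to sanity-check an extracted JSON file is to read back the few fields visible in the excerpt above. This is only a sketch: the key names (`JSONInfo`, `HDR10plusProfile`, `Version`, `SceneInfo`, `BezierCurveData`, `Anchors`) are taken from that excerpt, the rest of the schema is not assumed, and `summarize_hdr10plus_json` is a made-up helper name. The sample string is a cut-down illustration, not the real file.

```python
import json


def summarize_hdr10plus_json(text):
    """Return the profile, version, entry count and first Bezier anchors.

    Hypothetical helper; only the keys shown in the excerpt are read.
    """
    doc = json.loads(text)
    info = doc.get("JSONInfo", {})
    scenes = doc.get("SceneInfo", [])
    first_anchors = None
    if scenes:
        first_anchors = scenes[0].get("BezierCurveData", {}).get("Anchors")
    return {
        "profile": info.get("HDR10plusProfile"),
        "version": info.get("Version"),
        "entries": len(scenes),
        "first_anchors": first_anchors,
    }


# Cut-down sample using only the keys shown in the excerpt.
sample = """
{ "JSONInfo": { "HDR10plusProfile": "B", "Version": "1.0" },
  "SceneInfo": [ { "BezierCurveData": { "Anchors": [102, 205, 307] } } ] }
"""
print(summarize_hdr10plus_json(sample))
```

The first anchors can then be compared by eye with the `bezier_curve_anchors` values that ffprobe reports for the first frame.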

ffprobe -hide_banner -loglevel error -select_streams v:0 -show_frames -read_intervals "%+#1" -show_entries "frame=color_space,color_primaries,color_transfer,color_range,side_data_list,pix_fmt" -of default=noprint_wrappers=1 [input file]

pix_fmt=yuv420p10le
color_range=tv
color_space=bt2020nc
color_primaries=bt2020
color_transfer=smpte2084
side_data_type=Mastering display metadata
red_x=34000/50000
red_y=16000/50000
green_x=13250/50000
green_y=34500/50000
blue_x=7500/50000
blue_y=3000/50000
white_point_x=15635/50000
white_point_y=16450/50000
min_luminance=1/10000
max_luminance=10000000/10000
side_data_type=Content light level metadata
max_content=173
max_average=171
side_data_type=HDR Dynamic Metadata SMPTE2094-40 (HDR10+)
application version=1
num_windows=1
targeted_system_display_maximum_luminance=400/1
maxscl=4/100000
maxscl=9/100000
maxscl=28/100000
average_maxrgb=0/100000
num_distribution_maxrgb_percentiles=9
distribution_maxrgb_percentage=1
distribution_maxrgb_percentile=0/100000
distribution_maxrgb_percentage=5
distribution_maxrgb_percentile=22/100000
distribution_maxrgb_percentage=10
distribution_maxrgb_percentile=100/100000
distribution_maxrgb_percentage=25
distribution_maxrgb_percentile=0/100000
distribution_maxrgb_percentage=50
distribution_maxrgb_percentile=0/100000
distribution_maxrgb_percentage=75
distribution_maxrgb_percentile=0/100000
distribution_maxrgb_percentage=90
distribution_maxrgb_percentile=0/100000
distribution_maxrgb_percentage=95
distribution_maxrgb_percentile=0/100000
distribution_maxrgb_percentage=99
distribution_maxrgb_percentile=20/100000
fraction_bright_pixels=0/1000
knee_point_x=0/4095
knee_point_y=0/4095
num_bezier_curve_anchors=9
bezier_curve_anchors=102/1023
bezier_curve_anchors=205/1023
bezier_curve_anchors=307/1023
bezier_curve_anchors=410/1023
bezier_curve_anchors=512/1023
bezier_curve_anchors=614/1023
bezier_curve_anchors=717/1023
bezier_curve_anchors=819/1023
bezier_curve_anchors=922/1023
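The check above can be scripted so that each output file is classified the same way. A minimal sketch, assuming ffprobe's default writer with `noprint_wrappers=1` prints one `key=value` pair per line; the function names and the "HDR" / "HDR10+" labels are illustrative choices, not part of ffprobe or NVEnc.

```python
HDR10PLUS_MARKER = "SMPTE2094-40"  # appears in the HDR10+ side_data_type value


def ffprobe_first_frame_cmd(path):
    """Build the ffprobe invocation quoted above for the first video frame."""
    return [
        "ffprobe", "-hide_banner", "-loglevel", "error",
        "-select_streams", "v:0", "-show_frames",
        "-read_intervals", "%+#1",
        "-show_entries",
        "frame=color_space,color_primaries,color_transfer,color_range,"
        "side_data_list,pix_fmt",
        "-of", "default=noprint_wrappers=1",
        path,
    ]


def side_data_types(ffprobe_output):
    """Collect every side_data_type value from key=value ffprobe output."""
    types = []
    for line in ffprobe_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "side_data_type":
            types.append(value.strip())
    return types


def classify(ffprobe_output):
    """Return 'HDR10+' if dynamic metadata is present, 'HDR' if only static."""
    types = side_data_types(ffprobe_output)
    if any(HDR10PLUS_MARKER in t for t in types):
        return "HDR10+"
    if types:
        return "HDR"
    return "no side data"


sample = (
    "pix_fmt=yuv420p10le\n"
    "side_data_type=Mastering display metadata\n"
    "side_data_type=Content light level metadata\n"
    "side_data_type=HDR Dynamic Metadata SMPTE2094-40 (HDR10+)\n"
)
print(classify(sample))
```

To use it on a real file, pass the list from `ffprobe_first_frame_cmd(path)` to `subprocess.run(..., capture_output=True, text=True)` and feed the captured stdout to `classify`.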

Shanghai-Tom commented 1 year ago

I will perform a further matrix of tests. At the moment it appears that any conversion that reads the json file never produces hdr10+ sidedata in the output file; this applies to 6.02, 7.02, and 7.03.

I will update again in a couple of days with my findings, using multiple files in case the one that caused the crash is faulty in some way. Regards, Tom.

rigaya commented 1 year ago

Thank you for testing, and sharing the result.

> The change is a regression because hdr10+ sidedata was in the file with the prior version 7.02 without using subburn

I've checked further and found where the dynamic HDR10+ data was being deleted when using vpp-subburn.

The new commit 49020a7 is the fix for this problem, and also reverts the previous commit 701be03, as it should not be required anymore.

Would you please test with the new commit?

> json file

I forgot to tell you that --dhdr10-info from json requires hdr10plus_gen in the path. However, I'm sorry to say that I haven't tried this tool on Linux systems yet...

It should not be required if --dhdr10-info copy works out fine.
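Since the json route depends on an external tool being on the path, a small pre-flight check can make that dependency visible before a long encode starts. A sketch using only the Python standard library; `find_on_path` is a hypothetical helper, and the only name taken from this thread is the executable `hdr10plus_gen`.

```python
import shutil


def find_on_path(tool="hdr10plus_gen"):
    """Return the full path of `tool` if it is on PATH, otherwise None."""
    return shutil.which(tool)


location = find_on_path()
if location is None:
    print("hdr10plus_gen not found on PATH; --dhdr10-info copy does not need it")
else:
    print("hdr10plus_gen found at", location)
```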

Shanghai-Tom commented 1 year ago

Thank you for the update, I will download the latest version, compile and test my 2 files

Shanghai-Tom commented 1 year ago

I did not re-test the json file on version 7.03 r2368, but I did test subburn with lookahead and had no problems.

All attempts to use a json file extracted using hdr10plus_tool failed on all versions because there is no hdr10plus_gen for Linux yet.

I would be happy to assist with testing when a Linux version is available.

=========================================================================

These test results show whether hdr10+ side-data is in the output file.

Two different hdr10+ files from two different sources were used, marked as F1 and F2.

They were first copied with ffmpeg V5 to a new file. ffmpeg indicated no errors, but checking is minimal on copy operations.

The whole file was processed each time, typically a 45 minute TV episode, and converted directly to an mp4 file with audio and any burned-in subtitles.

Additional subtitles (English for me) are kept as SRT files for the video player to use.

Note: V6.02 does not allow lookahead with --dhdr10-info, so that case cannot be tested.

HDR - File only has HDR side data [ Failure ]

HDR10+ - File has HDR10+ side-data [ Success ]


> .-------------------------------------.------------------.------------------------.
> |                   with lookahead    |  no lookahead    |  subburn no lookahead  |
> |-----------------.---------.---------.---------.--------.---------.--------------|
> |  dhdr10-info -> |  json   | copy    |  json   | copy   |  json   |  copy        |
> |-----------------+---------+---------+---------+--------+---------+--------------|
> | 6.02 r2191  F1  | no test | no test |  HDR    | HDR10+ |  HDR    |  Crashed     |
> |             F2  | no test | no test |  HDR    | HDR10+ |  HDR    |  Crashed     |
> |-----------------+---------+---------+---------+--------+---------+--------------|
> | 7.02 r2361  F1  | HDR     | HDR10+  |  HDR    | HDR10+ |  HDR    |  HDR         |
> |             F2  | HDR     | HDR10+  |  HDR    | HDR10+ |  HDR    |  HDR         |
> |-----------------+---------+---------+---------+--------+---------+--------------|
> | 7.03 r2367  F1  | HDR     | HDR10+  |  HDR    | HDR10+ |  HDR    |  HDR         |
> |             F2  | HDR     | HDR10+  |  HDR    | HDR10+ |  HDR    |  HDR         |
> |-----------------+---------+---------+---------+--------+---------+--------------|
> | 7.03 r2368  F1  | no test | HDR10+  | no test | HDR10+ | no test |  HDR10+      |
> |             F2  | no test | HDR10+  | no test | HDR10+ | no test |  HDR10+      |
> '-----------------'---------'---------'---------'--------'---------'--------------'
rigaya commented 1 year ago

Nice to hear it now works, and thank you for testing with many conditions.

The result is as expected, and I'm glad that 7.03 r2368 works fine with --dhdr10-info copy in every condition.

Shanghai-Tom commented 1 year ago

Thank You :-), closing the Ticket as solved.