CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz

SVT-HEVC broken with yuv444p10le #288

Closed alatteri closed 1 year ago

alatteri commented 1 year ago

Hello,

In master, using pixfmt yuv444p10le with SVT-HEVC is now broken. It did work in builds from just a few weeks ago. See below:

BROKEN:

./UltraGrid-continuous-x86_64.AppImage -t decklink:codec=R12L -c libavcodec:encoder=libsvt_hevc:preset=10:la_depth=2:qp=20:pred_struct=0:gop=24 --param lavc-use-codec=yuv444p10le 
UltraGrid 1.8+ (master rev 4c2dce3 built Feb  1 2023 10:43:57)

Display device   : none
Capture device   : decklink
Audio capture    : none
Audio playback   : none
MTU              : 1500 B
Video compression: libavcodec:encoder=libsvt_hevc:preset=10:la_depth=2:qp=20:pred_struct=0:gop=24
Audio codec      : PCM
Network protocol : UltraGrid RTP
Audio FEC        : none
Video FEC        : none

[1675303854.886] [Decklink capture] Using codec: R12L
[1675303854.887] [DeckLink capture] Using device UltraStudio 4K Mini
[1675303854.887] [DeckLink capture] bmdDeckLinkConfigCapturePassThroughMode set to: 1885628787
[1675303854.887] The desired display mode is supported: 1080p23.98
[1675303854.887] [DeckLink capture] Enable video input: 1080p23.98
[1675303854.887] [DeckLink] Trying to autodetect format.
[1675303854.942] Control socket listening on port 33349
[1675303854.975] Found empty UDP port pair 32768/32769
[1675303854.975] Connected IP version 6
    Last message repeated 1 times

[1675303854.992] Frame received (#0) - No input signal detected
[1675303854.992] [Decklink capture] Format change detected (display mode, color space - RGB444, 10bit).
[1675303854.992] [Decklink capture] Using codec: R12L
[1675303854.992] [DeckLink capture] Enable video input: 2160p24
[1675303855.096] Waiting for new frame timed out!
[1675303855.119] [Decklink capture] Format change detected (color space - RGB444, 10bit).
[1675303855.163] [lavc] Using codec: H.265, encoder: libsvt_hevc
[1675303855.172] [lavc] Setting bitrate to 31.8 Mbps.
[1675303855.172] [lavc] Trying pixfmt: yuv444p10le
SVT [version]:  SVT-HEVC Encoder Lib v1.5.1
SVT [build]  :  GCC 7.5.0    64 bit
LIB Build date: Jan 20 2023 15:30:52
-------------------------------------------
[1675303855.253] [lavc libsvt_hevc @ 0x7ff898000d00] Rext Profile forced for 422 or 444
Number of logical cores available: 16
Number of PPCS 56
Number of threads 48
------------------------------------------- 
SVT [config]: MainEXT Profile   Tier (auto) Level (auto)    
SVT [config]: EncoderMode / Tune                            : 10 / 1 
SVT [config]: EncoderBitDepth / CompressedTenBitFormat / EncoderColorFormat         : 10 / 0 / 3
SVT [config]: SourceWidth / SourceHeight / InterlacedVideo              : 3840 / 2160 / 0
SVT [config]: Fps_Numerator / Fps_Denominator / Gop Size / IntraRefreshType         : 24 / 1 / 24 / 0
SVT [config]: HierarchicalLevels / BaseLayerSwitchMode / PredStructure          : 3 / 0 / 0 
SVT [config]: BRC Mode / QP / LookaheadDistance / SceneChange               : CQP / 20 / 2 / 1 
SVT [config]: BitRateReduction / ImproveSharpness                   : 0 / 0 
SVT [config]: tileColumnCount / tileRowCount / tileSliceMode / Constraint MV        : 4 / 4 / 1 / 1
SVT [config]: De-blocking Filter / SAO Filter                       : 1 / 1 
SVT [config]: HME / UseDefaultHME                           : 1 / 1 
SVT [config]: MV Search Area Width / Height                         : 16 / 7 
SVT [config]: HRD / VBV MaxRate / BufSize / BufInit                 : 0 / 0 / 0 / 90
------------------------------------------- 
allocate memory failed, at /var/tmp/ffmpeg/SVT-HEVC/Source/Lib/Codec/EbSystemResourceManager.c, L121
[1675303855.799] [lavc libsvt_hevc @ 0x7ff898000d00] Failed to init encoder
[1675303855.861] [lavc] Could not open codec for pixel format yuv444p10le
[1675303855.861] [lavc] Codec supported pixel formats: yuv420p yuv420p10le yuv422p yuv422p10le yuv444p yuv444p10le
[1675303855.861] [lavc] Supported pixel formats: yuv444p10le
[1675303855.861] [lavc] Unable to find suitable pixel format for: R12L.
[1675303855.861] [lavc] Requested parameters not supported. Do not enforce encoder codec or use a supported one.
[1675303855.861] [lavc] Using codec: H.265, encoder: libsvt_hevc
[1675303855.862] [lavc] Setting bitrate to 31.8 Mbps.
[1675303855.862] [lavc] Trying pixfmt: yuv444p10le
SVT [version]:  SVT-HEVC Encoder Lib v1.5.1
SVT [build]  :  GCC 7.5.0    64 bit
LIB Build date: Jan 20 2023 15:30:52
-------------------------------------------

WORKING:

./UltraGrid-continuous-x86_64.AppImage -t decklink:codec=R12L -c libavcodec:encoder=libsvt_hevc:preset=10:la_depth=2:qp=20:pred_struct=0:gop=24 --param lavc-use-codec=yuv444p10le

UltraGrid 1.8+ (master rev 481cc66 built Jan 13 2023 16:12:37)

Display device   : none
Capture device   : decklink
Audio capture    : none
Audio playback   : none
MTU              : 1500 B
Video compression: libavcodec:encoder=libsvt_hevc:preset=10:la_depth=2:qp=20:pred_struct=0:gop=24
Audio codec      : PCM
Network protocol : UltraGrid RTP
Audio FEC        : none
Video FEC        : none

[Decklink capture] Using codec: R12L
[DeckLink capture] Using device DeckLink 4K Pro
[DeckLink capture] bmdDeckLinkConfigCapturePassThroughMode set to: 1885628787
The desired display mode is supported: 1080p23.98
[DeckLink capture] Enable video input: 1080p23.98
[DeckLink] Trying to autodetect format.
Control socket listening on port 45361
Frame received (#0) - No input signal detected
[Decklink capture] Format change detected (display mode, color space - RGB444, 10bit).
[Decklink capture] Using codec: R12L
[DeckLink capture] Enable video input: 2160p24
[Decklink capture] Format change detected (color space - RGB444, 10bit).
[lavc] Using codec: H.265, encoder: libsvt_hevc
[lavc] Setting bitrate to 31.8 Mbps.
SVT [version]:  SVT-HEVC Encoder Lib v1.5.1
SVT [build]  :  GCC 7.5.0    64 bit
LIB Build date: Jan  2 2023 11:22:11
-------------------------------------------
[lavc libsvt_hevc @ 0x7f1a80000d00] Rext Profile forced for 422 or 444
Number of logical cores available: 32
Number of PPCS 56
Number of threads 96
------------------------------------------- 
SVT [config]: MainEXT Profile   Tier (auto) Level (auto)    
SVT [config]: EncoderMode / Tune                            : 10 / 1 
SVT [config]: EncoderBitDepth / CompressedTenBitFormat / EncoderColorFormat         : 10 / 0 / 3
SVT [config]: SourceWidth / SourceHeight / InterlacedVideo              : 3840 / 2160 / 0
SVT [config]: Fps_Numerator / Fps_Denominator / Gop Size / IntraRefreshType         : 24 / 1 / 24 / 0
SVT [config]: HierarchicalLevels / BaseLayerSwitchMode / PredStructure          : 3 / 0 / 0 
SVT [config]: BRC Mode / QP / LookaheadDistance / SceneChange               : CQP / 20 / 2 / 1 
SVT [config]: BitRateReduction / ImproveSharpness                   : 0 / 0 
SVT [config]: tileColumnCount / tileRowCount / tileSliceMode / Constraint MV        : 4 / 4 / 1 / 1
SVT [config]: De-blocking Filter / SAO Filter                       : 1 / 1 
SVT [config]: HME / UseDefaultHME                           : 1 / 1 
SVT [config]: MV Search Area Width / Height                         : 16 / 7 
SVT [config]: HRD / VBV MaxRate / BufSize / BufInit                 : 0 / 0 / 0 / 90
------------------------------------------- 

SVT [WARNING] Elevated privileges required to run with real-time policies! Check Linux Best Known Configuration in User Guide to run application in real-time without elevated privileges!

[lavc] Selected pixfmt: yuv444p10le
[lavc] Selected pixfmt has not 4:2:0 subsampling, which is usually not supported by hw. decoders
[DeckLink capture] 103 frames in 5.01899 seconds = 20.5221 FPS
[DeckLink capture] 120 frames in 5.00007 seconds = 23.9996 FPS
[DeckLink capture] 120 frames in 5.00007 seconds = 23.9997 FPS
[DeckLink capture] 120 frames in 5.00002 seconds = 23.9999 FPS

alatteri commented 1 year ago

In addition, if I don't force --param lavc-use-codec=yuv444p10le, UG will choose yuv422p10le, which is a less color-accurate pixfmt.

./UltraGrid-continuous-x86_64.AppImage -t decklink:codec=R12L -c libavcodec:encoder=libsvt_hevc:preset=10:la_depth=2:qp=20:pred_struct=0:gop=24 10.55.121.22 
UltraGrid 1.8+ (master rev 4c2dce3 built Feb  1 2023 10:43:57)

Display device   : none
Capture device   : decklink
Audio capture    : none
Audio playback   : none
MTU              : 1500 B
Video compression: libavcodec:encoder=libsvt_hevc:preset=10:la_depth=2:qp=20:pred_struct=0:gop=24
Audio codec      : PCM
Network protocol : UltraGrid RTP
Audio FEC        : none
Video FEC        : none

[1675304014.543] [Decklink capture] Using codec: R12L
[1675304014.544] [DeckLink capture] Using device UltraStudio 4K Mini
[1675304014.544] [DeckLink capture] bmdDeckLinkConfigCapturePassThroughMode set to: 1885628787
[1675304014.544] The desired display mode is supported: 1080p23.98
[1675304014.544] [DeckLink capture] Enable video input: 1080p23.98
[1675304014.544] [DeckLink] Trying to autodetect format.

[1675304014.660] Frame received (#0) - No input signal detected
[1675304014.660] [Decklink capture] Format change detected (display mode, color space - RGB444, 10bit).
[1675304014.660] [Decklink capture] Using codec: R12L
[1675304014.660] [DeckLink capture] Enable video input: 2160p24
[1675304014.763] Waiting for new frame timed out!
[1675304014.787] [Decklink capture] Format change detected (color space - RGB444, 10bit).
[1675304014.832] [lavc] Using codec: H.265, encoder: libsvt_hevc
[1675304014.840] [lavc] Setting bitrate to 31.8 Mbps.
[1675304014.840] [lavc] Trying pixfmt: yuv444p10le
SVT [version]:  SVT-HEVC Encoder Lib v1.5.1
SVT [build]  :  GCC 7.5.0    64 bit
LIB Build date: Jan 20 2023 15:30:52
-------------------------------------------
[1675304014.921] [lavc libsvt_hevc @ 0x7f4180001240] Rext Profile forced for 422 or 444
Number of logical cores available: 16
Number of PPCS 56
Number of threads 48
------------------------------------------- 
SVT [config]: MainEXT Profile   Tier (auto) Level (auto)    
SVT [config]: EncoderMode / Tune                            : 10 / 1 
SVT [config]: EncoderBitDepth / CompressedTenBitFormat / EncoderColorFormat         : 10 / 0 / 3
SVT [config]: SourceWidth / SourceHeight / InterlacedVideo              : 3840 / 2160 / 0
SVT [config]: Fps_Numerator / Fps_Denominator / Gop Size / IntraRefreshType         : 24 / 1 / 24 / 0
SVT [config]: HierarchicalLevels / BaseLayerSwitchMode / PredStructure          : 3 / 0 / 0 
SVT [config]: BRC Mode / QP / LookaheadDistance / SceneChange               : CQP / 20 / 2 / 1 
SVT [config]: BitRateReduction / ImproveSharpness                   : 0 / 0 
SVT [config]: tileColumnCount / tileRowCount / tileSliceMode / Constraint MV        : 4 / 4 / 1 / 1
SVT [config]: De-blocking Filter / SAO Filter                       : 1 / 1 
SVT [config]: HME / UseDefaultHME                           : 1 / 1 
SVT [config]: MV Search Area Width / Height                         : 16 / 7 
SVT [config]: HRD / VBV MaxRate / BufSize / BufInit                 : 0 / 0 / 0 / 90
------------------------------------------- 
allocate memory failed, at /var/tmp/ffmpeg/SVT-HEVC/Source/Lib/Codec/EbSystemResourceManager.c, L121
[1675304015.467] [lavc libsvt_hevc @ 0x7f4180001240] Failed to init encoder
[1675304015.517] [lavc] Could not open codec for pixel format yuv444p10le
[1675304015.517] [lavc] Setting bitrate to 31.8 Mbps.
[1675304015.517] [lavc] Trying pixfmt: yuv422p10le
SVT [version]:  SVT-HEVC Encoder Lib v1.5.1
SVT [build]  :  GCC 7.5.0    64 bit
LIB Build date: Jan 20 2023 15:30:52
-------------------------------------------
[1675304015.521] [lavc libsvt_hevc @ 0x7f417bbc6bc0] Rext Profile forced for 422 or 444
Number of logical cores available: 16
Number of PPCS 56
Number of threads 48
------------------------------------------- 
SVT [config]: MainEXT Profile   Tier (auto) Level (auto)    
SVT [config]: EncoderMode / Tune                            : 10 / 1 
SVT [config]: EncoderBitDepth / CompressedTenBitFormat / EncoderColorFormat         : 10 / 0 / 2
SVT [config]: SourceWidth / SourceHeight / InterlacedVideo              : 3840 / 2160 / 0
SVT [config]: Fps_Numerator / Fps_Denominator / Gop Size / IntraRefreshType         : 24 / 1 / 24 / 0
SVT [config]: HierarchicalLevels / BaseLayerSwitchMode / PredStructure          : 3 / 0 / 0 
SVT [config]: BRC Mode / QP / LookaheadDistance / SceneChange               : CQP / 20 / 2 / 1 
SVT [config]: BitRateReduction / ImproveSharpness                   : 0 / 0 
SVT [config]: tileColumnCount / tileRowCount / tileSliceMode / Constraint MV        : 4 / 4 / 1 / 1
SVT [config]: De-blocking Filter / SAO Filter                       : 1 / 1 
SVT [config]: HME / UseDefaultHME                           : 1 / 1 
SVT [config]: MV Search Area Width / Height                         : 16 / 7 
SVT [config]: HRD / VBV MaxRate / BufSize / BufInit                 : 0 / 0 / 0 / 90
------------------------------------------- 

SVT [WARNING] Elevated privileges required to run with real-time policies! Check Linux Best Known Configuration in User Guide to run application in real-time without elevated privileges!

[1675304015.870] [lavc] Codec supported pixel formats: yuv420p yuv420p10le yuv422p yuv422p10le yuv444p yuv444p10le
[1675304015.870] [lavc] Supported pixel formats: gbrp12le rgb48le gbrp16le gbrp10le x2rgb10le rgb24 rgba bgra gbrp bgr0 yuv444p12le xv36le yuv444p16le yuv444p10le xv30le yuv422p10le yuv420p10le vaapi
[1675304015.870] [lavc] Codec libsvt_hevc capabilities: 0x00000020 using thread type 0, count 1
[1675304015.870] [lavc] Selected pixfmt: yuv422p10le
[1675304015.870] [lavc] Selected pixfmt has not 4:2:0 subsampling, which is usually not supported by hw. decoders
[1675304019.662] [DeckLink capture] 93 frames in 5.0317 seconds = 18.4828 FPS
[1675304024.662] [DeckLink capture] 120 frames in 5.00021 seconds = 23.999 FPS
Caught signal 2

MartinPulec commented 1 year ago

Thanks for the report, my remarks follow.

> In addition, if I don't force --param lavc-use-codec=yuv444p10le, UG will choose yuv422p10le, which is a less color-accurate pixfmt.

If you look at the output, UltraGrid actually does select yuv444p10le first, but since its initialization fails, it tries the next pixfmt in order (which doesn't happen if you enforce the pixfmt). yuv422p10le is the second best.

Returning to the main problem: I tested a version from 7 Jan '23, which should predate the version you use, but it behaved similarly to the current version for me. With your parameters it worked on a machine with 128 GB RAM + 465 GB swap, but not on one with 16+4 GB. There it did start once I added a 64 GB swap file, but it exhausted the swap relatively quickly (which is weird, because even on the 128 GB machine the process takes "only" some 8 GB of resident memory).
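
For a sense of scale, here is a back-of-the-envelope calculation of the raw frame size at this resolution and pixel format. It is only arithmetic on numbers from the logs (3840x2160, 4:4:4, 10-bit), assuming each 10-bit sample occupies two bytes and treating "GB" as GiB. It says nothing about what SVT-HEVC actually allocates internally.

```python
WIDTH, HEIGHT = 3840, 2160   # SourceWidth / SourceHeight from the SVT [config] lines
PLANES = 3                   # Y, U and V, all at full resolution for 4:4:4
BYTES_PER_SAMPLE = 2         # yuv444p10le keeps each 10-bit sample in a 16-bit word

frame_bytes = WIDTH * HEIGHT * PLANES * BYTES_PER_SAMPLE
frame_mib = frame_bytes / 2**20


def frames_that_fit(memory_gib: float) -> int:
    """Number of whole raw frames that fit into memory_gib GiB (assumption: GB == GiB)."""
    return int(memory_gib * 2**30 // frame_bytes)


print(frame_bytes)          # 49766400
print(round(frame_mib, 1))  # 47.5
print(frames_that_fit(8))   # 172
print(frames_that_fit(16))  # 345
```

In other words, 8 GiB corresponds to roughly 170 raw frames and 16 GiB to about 345, which gives a rough yardstick for how many picture-sized buffers would have to be live at once to exhaust the memory.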

Anyway, the two UG versions didn't seem to differ in SVT-HEVC behavior. The newer one just behaves slightly differently around it: it doesn't fall back to another pixfmt if one is enforced, and if none is enforced, it selects a better second-choice pixfmt. Also, a lower resolution works, so I believe this is rather a problem of SVT-HEVC allocating an excessive amount of memory.
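
The selection behavior described above can be sketched roughly as follows. This is an illustrative sketch only, not UltraGrid's actual code: `select_pixfmt`, `try_open` and `fake_open` are hypothetical names, and the candidate order beyond the first two entries seen in the log is an assumption.

```python
from typing import Callable, Optional, Sequence


def select_pixfmt(
    candidates: Sequence[str],
    try_open: Callable[[str], bool],
    enforced: Optional[str] = None,
) -> Optional[str]:
    """Return the first pixel format the encoder opens with, or None."""
    if enforced is not None:
        # An enforced format is the only one tried: no fallback.
        return enforced if try_open(enforced) else None
    for fmt in candidates:
        # Otherwise walk the preference order until one initializes.
        if try_open(fmt):
            return fmt
    return None


# Simulate the failing machine: 4:4:4 init fails, everything else works.
def fake_open(fmt: str) -> bool:
    return fmt != "yuv444p10le"


# Hypothetical preference order; only the first two entries appear in the log.
order = ["yuv444p10le", "yuv422p10le", "yuv420p10le"]
auto_choice = select_pixfmt(order, fake_open)
forced_choice = select_pixfmt(order, fake_open, enforced="yuv444p10le")
print(auto_choice)    # yuv422p10le
print(forced_choice)  # None
```

This matches the two logs: without the parameter the encoder ends up on yuv422p10le, and with `--param lavc-use-codec=yuv444p10le` there is nothing left to fall back to.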

Could you try to run both versions on the same machine? If we confirm that this is the root of the problem, we can look at whether there is any possibility of reducing the amount of allocated memory. Just FYI, the line on which the allocation fails during initialization is just a plain calloc().

alatteri commented 1 year ago

Weird... because although the machine on which SVT-HEVC works is a different machine (AMD Ryzen 9 3950X), it also has only 16GB of RAM. I'll dig up some more RAM for the NUC12 and try again.

MartinPulec commented 1 year ago

> AMD Ryzen 9 3950X, it also has only 16GB of RAM.

I was wondering whether the library also has some heuristic for how much to allocate, based on e.g. the core count or something... But then it is weird that on a machine with 128 GB RAM and 40 logical cores it consumes 8 GB of memory, while on a 16-core one 16 GB is not enough.

alatteri commented 1 year ago

I will try the most recent continuous build on the known-good machine, to make sure that it is not the code but something weirdly NUC-specific.

alatteri commented 1 year ago

I've upgraded the memory in the NUC12 to 32GB. Something weird happens: it tries to run and then gets killed.

               total        used        free      shared  buff/cache   available
Mem:            30Gi       399Mi        30Gi        19Mi       140Mi        30Gi
./UltraGrid-continuous-x86_64.AppImage -t decklink:codec=R12L -c libavcodec:encoder=libsvt_hevc:preset=10:la_depth=2:qp=20:pred_struct=0:gop=24
UltraGrid 1.8+ (master rev c189087 built Feb  3 2023 15:34:10)
[1675459727.819] Frame received (#0) - No input signal detected
[1675459727.819] [Decklink capture] Format change detected (display mode, color space - RGB444, 10bit).
[1675459727.819] [Decklink capture] Using codec: R12L
[1675459727.819] [DeckLink capture] Enable video input: 2160p24
[1675459727.916] Waiting for new frame timed out!
[1675459727.920] [Decklink capture] Format change detected (color space - RGB444, 10bit).
[1675459728.021] [lavc] Using codec: H.265, encoder: libsvt_hevc
[1675459728.030] [lavc] Setting bitrate to 31.8 Mbps.
[1675459728.032] [lavc] Trying pixfmt: yuv444p10le
SVT [version]:  SVT-HEVC Encoder Lib v1.5.1
SVT [build]  :  GCC 7.5.0    64 bit
LIB Build date: Feb  3 2023 14:05:21
-------------------------------------------
[1675459728.075] [lavc libsvt_hevc @ 0x7fa860001240] Rext Profile forced for 422 or 444
Number of logical cores available: 16
Number of PPCS 56
Number of threads 48
------------------------------------------- 
SVT [config]: MainEXT Profile   Tier (auto) Level (auto)    
SVT [config]: EncoderMode / Tune                            : 10 / 1 
SVT [config]: EncoderBitDepth / CompressedTenBitFormat / EncoderColorFormat         : 10 / 0 / 3
SVT [config]: SourceWidth / SourceHeight / InterlacedVideo              : 3840 / 2160 / 0
SVT [config]: Fps_Numerator / Fps_Denominator / Gop Size / IntraRefreshType         : 24 / 1 / 24 / 0
SVT [config]: HierarchicalLevels / BaseLayerSwitchMode / PredStructure          : 3 / 0 / 0 
SVT [config]: BRC Mode / QP / LookaheadDistance / SceneChange               : CQP / 20 / 2 / 1 
SVT [config]: BitRateReduction / ImproveSharpness                   : 0 / 0 
SVT [config]: tileColumnCount / tileRowCount / tileSliceMode / Constraint MV        : 4 / 4 / 1 / 1
SVT [config]: De-blocking Filter / SAO Filter                       : 1 / 1 
SVT [config]: HME / UseDefaultHME                           : 1 / 1 
SVT [config]: MV Search Area Width / Height                         : 16 / 7 
SVT [config]: HRD / VBV MaxRate / BufSize / BufInit                 : 0 / 0 / 0 / 90
------------------------------------------- 
Killed

MartinPulec commented 1 year ago

I've done some more tests and it is certainly an SVT-HEVC bug. It is also reproducible without UltraGrid:

ffmpeg -f lavfi -i smptebars=size=3840x2160 -pix_fmt yuv444p10le -t 1 -strict -1 in.y4m
SvtHevcEncApp -i in.y4m -b out.mp4

It runs until it is killed by the OOM killer, consuming memory very quickly.
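
When reproducing this, it may be safer to cap the process's address space so that allocations fail early instead of the whole machine running into the OOM killer. Below is a minimal sketch for Linux using Python's standard `resource` and `subprocess` modules. The helper name `run_with_memory_cap` and the 2 GiB and 4 GiB figures are assumptions for illustration, and how SvtHevcEncApp reacts to a refused allocation has not been checked here.

```python
import resource
import subprocess
import sys
from typing import Sequence


def run_with_memory_cap(cmd: Sequence[str], max_bytes: int) -> int:
    """Run cmd with its virtual address space capped; return its exit status."""

    def limit() -> None:
        # Runs in the child just before exec. Only the soft limit is lowered,
        # and it is never set above the existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    return subprocess.run(list(cmd), preexec_fn=limit).returncode


# Stand-in child that tries to allocate 4 GiB under a 2 GiB cap.
hog = [sys.executable, "-c", "bytearray(4 * 2**30)"]
status = run_with_memory_cap(hog, 2 * 2**30)
print("exit status:", status)  # expected to be non-zero: the allocation is refused
```

For the reproducer above, the call would look like `run_with_memory_cap(["SvtHevcEncApp", "-i", "in.y4m", "-b", "out.mp4"], 8 * 2**30)`, where the 8 GiB cap is an arbitrary choice.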

MartinPulec commented 1 year ago

I can also confirm that 32 GB of RAM was sufficient on a different machine. I've already filed a bug report with the library upstream, and I'd conclude for now that it is not an UltraGrid bug.

alatteri commented 1 year ago

Agreed. Funny thing is, the 2 machines that are having issues are Intel. The AMD ones work fine.