intel / media-driver

Intel Graphics Media Driver to support hardware decode, encode and video processing.
https://github.com/intel/media-driver/wiki

[ICL][ffmpeg-qsv/-vaapi] low-power (HW encoding) mode doesn't work #580

Closed eero-t closed 4 years ago

eero-t commented 5 years ago

Setup:

Use-case:

Expected outcome:

Actual outcome:

Same thing happens both with FFmpeg QSV and VAAPI backends, so this isn't MediaSDK related.

AFAIK GuC/HuC isn't enabled in drm-tip git kernel. But as AVC-LP works fine, I don't see why that would be a problem for HEVC-LP.

dmenshov commented 5 years ago

Which rate control mode are you using? BRC needs HuC enabled in the kernel, which is not the case by default. So on a public kernel only CQP will work.
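
For readers who want to check the HuC precondition mentioned here, the following is a minimal sketch. It assumes the i915 `enable_guc` module parameter is a bitmask (bit 0: GuC submission, bit 1: HuC firmware loading, -1: driver default) exposed at `/sys/module/i915/parameters/enable_guc`. The helper names are hypothetical, and a set bit only means loading was requested, not that the firmware actually loaded.

```python
from pathlib import Path

# Assumed location of the i915 module parameter; it may differ between kernels.
ENABLE_GUC_PARAM = Path("/sys/module/i915/parameters/enable_guc")


def decode_enable_guc(value: int) -> dict:
    """Decode the enable_guc bitmask (assumed: bit 0 = GuC submission, bit 1 = HuC load)."""
    if value < 0:
        # A negative value means the driver picks a per-platform default.
        return {"auto": True, "guc_submission": None, "huc_load": None}
    return {
        "auto": False,
        "guc_submission": bool(value & 0x1),
        "huc_load": bool(value & 0x2),
    }


def read_enable_guc(path: Path = ENABLE_GUC_PARAM):
    """Return the decoded parameter, or None if the file cannot be read or parsed."""
    try:
        return decode_enable_guc(int(path.read_text().strip()))
    except (OSError, ValueError):
        return None
```

Calling `read_enable_guc()` on the test machine and looking at `huc_load` gives a first hint whether BRC can be expected to work at all.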

eero-t commented 5 years ago

"-qscale" option means CQP, FFmpeg "-low_power 1" option fails without it.

The issue seems to be the size of the video, as low power actually "seems to" work with smaller-resolution 8-bit input (*):

ffmpeg -loglevel verbose -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi -i 1280x720p_29.97_10mb_h264_cabac.264 -c:v hevc_vaapi -low_power 1 -qscale:v 20 -y test.h265
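
When repeating this test over several inputs, it can help to build the command line programmatically. The sketch below only uses the options from the command quoted in this comment; the function name and its defaults are hypothetical.

```python
import shlex


def hevc_vaapi_lp_command(src, dst, qp=20, device="/dev/dri/renderD128"):
    """Build the ffmpeg argument list for a low-power HEVC VAAPI transcode in CQP mode."""
    return [
        "ffmpeg", "-loglevel", "verbose",
        "-hwaccel", "vaapi",
        "-vaapi_device", device,
        "-hwaccel_output_format", "vaapi",
        "-i", src,
        "-c:v", "hevc_vaapi",
        "-low_power", "1",        # request the low-power (VDEnc) entrypoint
        "-qscale:v", str(qp),     # selects CQP rate control
        "-y", dst,
    ]


cmd = hevc_vaapi_lp_command("1280x720p_29.97_10mb_h264_cabac.264", "test.h265")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, capture_output=True, text=True)` to collect the log for later inspection.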

Both 4K 8-bit and 10-bit FullHD fail to encode with low power mode:

    Stream #0:0: Video: hevc (Main 10), 1 reference frame, yuv420p10le(tv), 1920x1080, 50 fps, 50 tbr, 1200k tbn, 50 tbc
Codec AVOption low_power (enable low power mode(experimental: many limitations by mfx version, BRC modes, etc.)) specified for output file #0 (test.h265) has not been used for any stream. The most likely reason is either wrong type (e.g. a video option with no video streams) or that it is a private option of some encoder which was not actually used for any stream.
...
    Stream #0:0: Video: hevc (Main), 1 reference frame, yuv420p(tv), 3840x2160, 60 fps, 60 tbr, 1200k tbn, 60 tbc
Codec AVOption low_power (enable low power mode(experimental: many limitations by mfx version, BRC modes, etc.)) specified for output file #0 (test.h265) has not been used for any stream. The most likely reason is either wrong type (e.g. a video option with no video streams) or that it is a private option of some encoder which was not actually used for any stream.
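
The warning above is easy to miss in a verbose log. As a small sketch (the function name is hypothetical), the unused codec options can be pulled out of captured FFmpeg output like this:

```python
import re

# Matches FFmpeg's "option was not used" warning and captures the option name.
UNUSED_OPTION_RE = re.compile(
    r"Codec AVOption (\S+) .*? has not been used for any stream"
)


def unused_codec_options(log_text):
    """Return the sorted, de-duplicated names of codec options FFmpeg reported as unused."""
    return sorted(set(UNUSED_OPTION_RE.findall(log_text)))
```

If `"low_power"` is in the returned list, the encode did not actually run in low-power mode, regardless of whether it succeeded.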

(*) While low-power mode does get enabled for smaller videos, those don't work either, as I then get a GPU hang and an encoding failure. The same happens with the MediaSDK sample app:

sample_encode h265 -cqp -qsv-ff -i 1920x1080.yuv -w 1920 -h 1080 -o test.h265
libva info: VA-API version 1.5.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /opt/install/Custom_4273/lib/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_5
libva info: va_openDriver() returns 0
Encoding Sample Version 8.3.26.4273

Input file format   YUV420
Output video        HEVC
Source picture:
    Resolution  1920x1088
    Crop X,Y,W,H    0,0,1920,1080
Destination picture:
    Resolution  1920x1088
    Crop X,Y,W,H    0,0,1920,1080
Frame rate  30.00
QPI 0
QPP 0
QPB 0
Gop size    0
Ref dist    0
Ref number  0
Idr Interval    0
Target usage    balanced
Memory type system
Media SDK impl      hw
Media SDK version   1.29

Processing started
GPU hang happened
ERROR: No free surfaces in pool (during long period)

[ERROR], sts=MFX_ERR_MEMORY_ALLOC(-4), Run, MSDK_INVALID_SURF_IDX==nEncSurfIdx error at /opt/builder/source1/media-sdk/samples/sample_encode/src/pipeline_encode.cpp:1959

[ERROR], sts=MFX_ERR_MEMORY_ALLOC(-4), main, pPipeline->Run failed at /opt/builder/source1/media-sdk/samples/sample_encode/src/sample_encode.cpp:1393
Frame number: 1
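
For automated runs, the `[ERROR]` lines in the sample output above can be turned into structured records. This is a sketch that assumes the line layout shown in this log; the type and function names are hypothetical.

```python
import re
from typing import NamedTuple


class MsdkError(NamedTuple):
    status: str    # e.g. MFX_ERR_MEMORY_ALLOC
    code: int      # numeric status, e.g. -4
    function: str  # function that reported the error
    message: str   # failed expression or description
    source: str    # source file path
    line: int      # source line number


# Layout assumed from the log above:
# [ERROR], sts=NAME(CODE), FUNCTION, MESSAGE at FILE:LINE
ERROR_RE = re.compile(
    r"^\[ERROR\], sts=(\w+)\((-?\d+)\), (\w+), (.*) at (\S+):(\d+)$",
    re.MULTILINE,
)


def parse_msdk_errors(output):
    """Extract all MediaSDK sample [ERROR] lines from captured output."""
    return [
        MsdkError(m[1], int(m[2]), m[3], m[4], m[5], int(m[6]))
        for m in ERROR_RE.finditer(output)
    ]


def gpu_hang_reported(output):
    """True if the sample app itself printed its GPU hang message."""
    return "GPU hang happened" in output
```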

dmitryermilov commented 5 years ago

@eero-t, this issue with CQP mode is new to us. Thanks for bringing it up. @dmenshov, would you like to investigate?

dmenshov commented 5 years ago

To reproduce the issue, I first need to know the versions of the libraries used in this case. @eero-t, could you please list them?

eero-t commented 5 years ago

As mentioned, latest from git when I was testing it, specifically:

Underlying system is Ubuntu 18.04.2 LTS (with all updates applied).

As can be seen from the example output above, the GPU hang already happens on the first frame. It's fully reproducible, not random.

eero-t commented 5 years ago

HW is ICL-B4 with v3071 BIOS version (two weeks ago that was latest BKC for B4).

dmenshov commented 5 years ago

Could you share the full log of the ffmpeg run (the output of: ffmpeg -loglevel verbose -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi -i 1280x720p_29.97_10mb_h264_cabac.264 -c:v hevc_vaapi -low_power 1 -qscale:v 20 -y test.h265)?

eero-t commented 5 years ago

Full FFmpeg output:

$ ffmpeg -loglevel verbose -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi -i 1280x720p_29.97_10mb_h264_cabac.264 -c:v hevc_vaapi -low_power 1 -qscale:v 20 -y test.h265
ffmpeg version N-93476-g391f884675 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.3.0-27ubuntu1~18.04)
  configuration: --prefix=/opt/install/Nightly_1699 --enable-libmfx --enable-vaapi --enable-sdl2 --disable-libx265 --disable-libx264 --enable-libvpx --enable-libvorbis --enable-libopus --disable-libmp3lame --disable-libass --disable-sndio --enable-libfreetype --enable-gpl --disable-doc
  libavutil      56. 26.100 / 56. 26.100
  libavcodec     58. 47.106 / 58. 47.106
  libavformat    58. 26.101 / 58. 26.101
  libavdevice    58.  7.100 / 58.  7.100
  libavfilter     7. 48.100 /  7. 48.100
  libswscale      5.  4.100 /  5.  4.100
  libswresample   3.  4.100 /  3.  4.100
  libpostproc    55.  4.100 / 55.  4.100
[AVHWDeviceContext @ 0x55e8df170540] Opened VA display via DRM device /dev/dri/renderD128.
[AVHWDeviceContext @ 0x55e8df170540] libva: VA-API version 1.5.0
[AVHWDeviceContext @ 0x55e8df170540] libva: va_getDriverName() returns 0
[AVHWDeviceContext @ 0x55e8df170540] libva: User requested driver 'iHD'
[AVHWDeviceContext @ 0x55e8df170540] libva: Trying to open /opt/install/lib/dri/iHD_drv_video.so
[AVHWDeviceContext @ 0x55e8df170540] libva: Found init function __vaDriverInit_1_5
[AVHWDeviceContext @ 0x55e8df170540] libva: va_openDriver() returns 0
[AVHWDeviceContext @ 0x55e8df170540] Initialised VAAPI connection: version 1.5
[AVHWDeviceContext @ 0x55e8df170540] VAAPI driver: Intel iHD driver - 1.0.0.
[AVHWDeviceContext @ 0x55e8df170540] Driver not found in known nonstandard list, using standard behaviour.
[h264 @ 0x55e8df1ae840] Reinit context to 1280x720, pix_fmt: yuv420p
[h264 @ 0x55e8df1880c0] max_analyze_duration 5000000 reached at 5005000 microseconds st:0
Input #0, h264, from 'input/1280x720p_29.97_10mb_h264_cabac.264':
  Duration: N/A, bitrate: N/A
    Stream #0:0: Video: h264 (High), 1 reference frame, yuv420p(tv, bt709, progressive, left), 1280x720 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1200k tbn, 59.94 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> hevc (hevc_vaapi))
Press [q] to stop, [?] for help
[h264 @ 0x55e8df1b2a00] Reinit context to 1280x720, pix_fmt: vaapi_vld
[graph 0 input from stream 0:0 @ 0x55e8df1e36c0] w:1280 h:720 pixfmt:vaapi_vld tb:1/1200000 fr:30000/1001 sar:1/1 sws_param:flags=2
[hevc_vaapi @ 0x55e8df1d4800] Input surface format is nv12.
[hevc_vaapi @ 0x55e8df1d4800] Using VAAPI profile VAProfileHEVCMain (17).
[hevc_vaapi @ 0x55e8df1d4800] Using VAAPI entrypoint VAEntrypointEncSliceLP (8).
[hevc_vaapi @ 0x55e8df1d4800] Using VAAPI render target format YUV420 (0x1).
[hevc_vaapi @ 0x55e8df1d4800] RC mode: CQP.
[hevc_vaapi @ 0x55e8df1d4800] RC quality: 2360.
[hevc_vaapi @ 0x55e8df1d4800] RC framerate: 30000/1001 (29.97 fps).
[hevc_vaapi @ 0x55e8df1d4800] Using intra and P-frames (supported references: 3 / 0).
[hevc_vaapi @ 0x55e8df1d4800] All wanted packed headers available (wanted 0xd, found 0x1f).
[hevc_vaapi @ 0x55e8df1d4800] Using level 3.1.
Output #0, hevc, to 'test.h265':
  Metadata:
    encoder         : Lavf58.26.101
    Stream #0:0: Video: hevc (hevc_vaapi) (Main), 1 reference frame, vaapi_vld(left), 1280x720 [SAR 1:1 DAR 16:9], q=-1--1, 29.97 fps, 29.97 tbn, 29.97 tbc
    Metadata:
      encoder         : Lavc58.47.106 hevc_vaapi
[hevc_vaapi @ 0x55e8df1d4800] Failed to end picture encode issue: 24 (internal encoding error).
[hevc_vaapi @ 0x55e8df1d4800] Encode failed: -5.
Video encoding failed
[AVIOContext @ 0x55e8df247a40] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 0x55e8df1b5d00] Statistics: 3833856 bytes read, 0 seeks
Conversion failed!
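
The lines that matter in the log above are the VAAPI profile, entrypoint and RC mode. A minimal sketch for extracting them from a captured log follows (the function name is hypothetical; `VAEntrypointEncSliceLP` is the VA-API low-power encode entrypoint, as seen in the log):

```python
import re


def vaapi_encode_summary(log_text):
    """Extract profile, entrypoint and RC mode from an FFmpeg VAAPI encoder log."""
    patterns = {
        "profile": r"Using VAAPI profile (\w+) \(\d+\)",
        "entrypoint": r"Using VAAPI entrypoint (\w+) \(\d+\)",
        "rc_mode": r"RC mode: (\w+)\.",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, log_text)
        summary[key] = match.group(1) if match else None
    # Low-power mode was really selected only if the LP entrypoint was chosen.
    summary["low_power"] = summary["entrypoint"] == "VAEntrypointEncSliceLP"
    return summary
```

For the log above this reports the HEVC Main profile, the low-power entrypoint and CQP, which confirms that the failure happens with low-power mode genuinely active.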

After which dmesg shows:

[  258.077061] i915 0000:00:02.0: GPU HANG: ecode 11:4:0x00000000, in ffmpeg [1743], hang on vcs0
[  258.077062] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  258.077063] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  258.077063] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  258.077064] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  258.077064] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  258.078330] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
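
To detect such hangs automatically after a test run, the `GPU HANG` line can be parsed from dmesg output. This sketch assumes the message layout shown above, which varies between kernel versions. The interpretation of the first ecode field as the GPU generation is an assumption, and the second field is kept as an uninterpreted string.

```python
import re

# Layout assumed from the dmesg excerpt above.
GPU_HANG_RE = re.compile(
    r"GPU HANG: ecode (\d+):([0-9a-fA-F]+):(0x[0-9a-fA-F]+), "
    r"in (.+?) \[(\d+)\](?:, hang on (\w+))?"
)


def parse_gpu_hangs(dmesg_text):
    """Return one dict per 'GPU HANG' line found in dmesg output."""
    hangs = []
    for m in GPU_HANG_RE.finditer(dmesg_text):
        hangs.append({
            "gen": int(m.group(1)),          # assumed: GPU generation
            "engine_field": m.group(2),      # left uninterpreted
            "ecode": int(m.group(3), 16),
            "process": m.group(4),
            "pid": int(m.group(5)),
            "engine": m.group(6),            # None if the kernel omits it
        })
    return hangs
```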

dmenshov commented 5 years ago

I observe the same:

./ffmpeg -loglevel verbose -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi -i input.h264 -c:v hevc_vaapi -low_power 1 -qscale:v 20 -y test.h265
ffmpeg version N-93322-gd227ed5 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 6.3.0 (GCC)
  configuration: --disable-static --enable-shared --enable-libdrm --enable-vaapi --enable-libmfx --disable-amf --disable-audiotoolbox --disable-cuda --disable-cuda-sdk --disable-cuvid --disable-d3d11va --disable-dxva2 --disable-libnpp --disable-mmal --disable-nvdec --disable-nvenc --disable-omx --disable-omx-rpi --disable-rkmpp --disable-v4l2-m2m --disable-vdpau --disable-videotoolbox
  libavutil      56. 26.100 / 56. 26.100
  libavcodec     58. 47.103 / 58. 47.103
  libavformat    58. 26.101 / 58. 26.101
  libavdevice    58.  6.101 / 58.  6.101
  libavfilter     7. 48.100 /  7. 48.100
  libswscale      5.  4.100 /  5.  4.100
  libswresample   3.  4.100 /  3.  4.100
[AVHWDeviceContext @ 0x18e4540] Opened VA display via DRM device /dev/dri/renderD128.
[AVHWDeviceContext @ 0x18e4540] libva: VA-API version 1.5.0
[AVHWDeviceContext @ 0x18e4540] libva: va_getDriverName() returns 0
[AVHWDeviceContext @ 0x18e4540] libva: User requested driver 'iHD'
[AVHWDeviceContext @ 0x18e4540] libva: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
[AVHWDeviceContext @ 0x18e4540] libva: Found init function __vaDriverInit_1_4
[AVHWDeviceContext @ 0x18e4540] libva: va_openDriver() returns 0
[AVHWDeviceContext @ 0x18e4540] Initialised VAAPI connection: version 1.5
[AVHWDeviceContext @ 0x18e4540] VAAPI driver: Intel iHD driver - 2.0.0.
[AVHWDeviceContext @ 0x18e4540] Driver not found in known nonstandard list, using standard behaviour.
[h264 @ 0x1924c40] Reinit context to 1280x720, pix_fmt: yuv420p
[h264 @ 0x1923d00] max_analyze_duration 5000000 reached at 5000000 microseconds st:0
Input #0, h264, from 'input.h264':
  Duration: N/A, bitrate: N/A
    Stream #0:0: Video: h264 (High), 1 reference frame, yuv420p(progressive, left), 1280x720 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 1200k tbn, 50 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> hevc (hevc_vaapi))
Press [q] to stop, [?] for help
[h264 @ 0x1929fc0] Reinit context to 1280x720, pix_fmt: vaapi_vld
[graph 0 input from stream 0:0 @ 0x194e400] w:1280 h:720 pixfmt:vaapi_vld tb:1/1200000 fr:25/1 sar:1/1 sws_param:flags=2
[hevc_vaapi @ 0x192aa00] Input surface format is nv12.
[hevc_vaapi @ 0x192aa00] Using VAAPI profile VAProfileHEVCMain (17).
[hevc_vaapi @ 0x192aa00] Using VAAPI entrypoint VAEntrypointEncSliceLP (8).
[hevc_vaapi @ 0x192aa00] Using VAAPI render target format YUV420 (0x1).
[hevc_vaapi @ 0x192aa00] RC mode: CQP.
[hevc_vaapi @ 0x192aa00] RC quality: 2360.
[hevc_vaapi @ 0x192aa00] RC framerate: 25/1 (25.00 fps).
[hevc_vaapi @ 0x192aa00] Using intra and P-frames (supported references: 3 / 0).
[hevc_vaapi @ 0x192aa00] All wanted packed headers available (wanted 0xd, found 0x1f).
[hevc_vaapi @ 0x192aa00] Using level 3.1.
Output #0, hevc, to 'test.h265':
  Metadata:
    encoder         : Lavf58.26.101
    Stream #0:0: Video: hevc (hevc_vaapi) (Main), 1 reference frame, vaapi_vld(left), 1280x720 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 25 tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.47.103 hevc_vaapi
[hevc_vaapi @ 0x192aa00] Failed to end picture encode issue: 24 (internal encoding error).
[hevc_vaapi @ 0x192aa00] Encode failed: -5.
Video encoding failed
[AVIOContext @ 0x192b280] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 0x18fe580] Statistics: 589824 bytes read, 0 seeks
Conversion failed!

dmenshov commented 5 years ago

I will try to debug it in my environment and will report my results later today.

dmenshov commented 5 years ago

Sorry, the result is not the same. I will try again with drm-tip.

dmenshov commented 5 years ago

I tried drm-tip and observe the same issue. I will file an issue on Bugzilla.

dmenshov commented 5 years ago

https://bugs.freedesktop.org/show_bug.cgi?id=110307

dmitryermilov commented 5 years ago

Reply from Tvrtko on Bugzilla:

According to my interpretation of the error state, execution hangs on command dw 0x73880080 (HEVC_VP9_RDOQ_STATE) submitted from a batch buffer.

IPEIR: 0x00000000
IPEHR: 0x34240010
INSTDONE: 0xbbffffff
batch: [0x00000000_018fb000, 0x00000000_018ff000]
BBADDR: 0x00000000_018fbf45

...
0x018fbf44: 0x73880080: 3D UNKNOWN: 3d_965 opcode = 0x7388
...

The only problem is that IPEIR claims execution is not in a batch buffer (bit 3 is not set), and IPEHR points to something different as well.

INSTDONE says the VCS and VIN units are running. I don't know what the latter is. Is it consistent with the hanging command?
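
The arithmetic behind this reading of the error state can be checked in a few lines. This is a sketch using only the values quoted above. Treating the two low bits of BBADDR as flags rather than address bits is an assumption, chosen because it yields the dword address 0x018fbf44 shown in the dump. The helper names are hypothetical.

```python
# Values quoted from the error state above (lower 32 bits of the addresses).
IPEIR = 0x00000000
BATCH_START, BATCH_END = 0x018FB000, 0x018FF000
BBADDR = 0x018FBF45
HANG_DWORD = 0x73880080


def ipeir_in_batch(ipeir):
    """Bit 3 of IPEIR is the 'executing from a batch buffer' flag, per the comment above."""
    return bool(ipeir & (1 << 3))


def dword_address(bbaddr):
    """Mask off the two low bits (assumed to be flags) to get a dword-aligned address."""
    return bbaddr & ~0x3


def inside_batch(addr, start=BATCH_START, end=BATCH_END):
    """True if addr lies within the half-open batch buffer range."""
    return start <= addr < end


def opcode(dword):
    """Upper 16 bits of the command header, as printed by the decoder."""
    return dword >> 16


print(hex(dword_address(BBADDR)), inside_batch(dword_address(BBADDR)))
print(hex(opcode(HANG_DWORD)), ipeir_in_batch(IPEIR))
```

This reproduces the inconsistency described above: BBADDR points inside the batch at the 0x7388 command, while IPEIR's bit 3 says execution is not in a batch buffer.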

tursulin commented 5 years ago

Basically, the error state suggests the hanging command could be HEVC_VP9_RDOQ_STATE. Media domain knowledge is needed to reason about possible causes.

dmitryermilov commented 5 years ago

It was confirmed that HEVC-LP needs HuC support, even in CQP mode. So the fact that HEVC-LP doesn't work on drm-tip is expected. However, MSDK/UMD should of course return an error at initialization in this case.
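
The kind of initialization check asked for here could look like the sketch below. It is not the driver's or MediaSDK's actual logic; it only encodes what this thread reports: HEVC and VP9 low-power encode need HuC even in CQP, BRC modes need HuC, and AVC low-power in CQP worked without it. All names are hypothetical.

```python
def low_power_needs_huc(codec, rc_mode):
    """Summarize the thread's statements on when low-power encode needs HuC."""
    codec = codec.lower()
    rc_mode = rc_mode.upper()
    if codec in ("hevc", "vp9"):
        # Reported in this thread: these need HuC in every RC mode, CQP included.
        return True
    # Other codecs (AVC in this thread): only BRC modes need HuC.
    return rc_mode != "CQP"


def check_low_power_init(codec, rc_mode, huc_loaded):
    """Fail early with a clear error instead of hanging the GPU on the first frame."""
    if low_power_needs_huc(codec, rc_mode) and not huc_loaded:
        raise RuntimeError(
            f"{codec} low-power encode with {rc_mode} requires HuC firmware, "
            "which is not loaded"
        )
```

With such a check, the HEVC CQP case from this issue would be rejected at setup time rather than failing with an internal encoding error.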

eero-t commented 5 years ago

Will HEVC-LP work on ICL also with 4K resolution and 10-bit when there's HuC? Or is it still limited to 8-bit at smaller resolutions?

Xiaogangli-intel commented 5 years ago

Hi Eero, currently the GuC and HuC firmware for ICL is not ready in drm-tip, so HEVC/VP9 low-power cases will not work. GuC/HuC upstreaming is a work in progress; we will let you know once it is ready.

Xiaogangli-intel commented 4 years ago

@eero-t The GuC and HuC firmware for ICL is now ready in drm-tip; please confirm. Let me know if you still have questions.

eero-t commented 4 years ago

I no longer have an ICL machine to test on. It may take weeks until there's a replacement.

Will HEVC-LP work on ICL also with 4K resolution and 10-bit when there's HuC? Or is it still limited to 8-bit at smaller resolutions?

Is there any answer to those questions? Should these work?

eero-t commented 4 years ago

I did some quick testing of everything discussed above with FFmpeg and monitored the results with the latest i-g-t intel_gpu_top. Everything seems to be working fine on ICL with the latest drm-tip, HuC firmware and media stack.

PS. The only odd thing was VP9 using low-power (video engine) encoding even when low-power mode is not specified, unlike the other codecs. Note: with FFmpeg, VP9 works only with the VAAPI backend; for some reason the FFmpeg QSV backend doesn't support it...