hbiyik closed this 5 months ago
Can MPV and Kodi handle DRM_PRIME (NV15) natively?
I think this is more of a task for the EGL interface (mesa). What both Kodi and mpv do is receive a DRM prime frame and import it into the EGL interface using attribs, so they should both support it as long as mesa does. But the current mesa-panfork does not; maybe ~panfork~ panthor will.
On the other hand, Kodi can bypass EGL and render directly on DRM planes through the kernel, but that is always buggy; I don't use it.
Then the following options are available:
- Modify MPV and Kodi to allow them to use RGA filters.
Not maintainable.
- Glue RGA and MPP_DEC together in FFmpeg.
I already tried; this works to a certain extent, with spaghetti code.
- Add proper RGA support in the MPP libs, allowing MPP_DEC to output converted images, just like IEP de-interlacing.
I doubt this will ever happen.
- Wait for panthor to support NV15.
I think this is the only reasonable, future-proof way forward at the moment. Meanwhile I can try to take advantage of the HwFramesContext config method to provide NV12/NV16 DRM prime frames.
What I am trying to do in this PR:

When you define the HWAccel of the decoder with the flag AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX, the client is responsible for creating the device and providing its context. Since the RKMPP device is newly invented, no client supports this.
However, at least for the decoder (not the RGA-related parts), the RKMPP device is actually a DRM device with its own quirks. There are two ways to achieve compliance.
Note that .hwconfigs is actually an array, so one decoder can have multiple hwaccel devices. It is therefore possible to add another hwaccel entry here with .device_type = AV_HWDEVICE_TYPE_DRM. However, I think FFmpeg has some issues there, i.e. ff_get_format only checks the first matched HWAccel, not the whole array, and quits there. I think this is either a bug, or having multiple hwaccel devices is simply not common practice, so I did not go down this path.
When the decoder claims that it supports the AV_CODEC_HW_CONFIG_METHOD_HW_FRAMES_CTX method, the client does not necessarily need to provide the device context; instead, the codec creates its own hwdevice context, and the client receives DRM prime frames over hw_frames_ctx. This fits this decoder better, and it is what I am trying to implement.
Generally speaking, in FFmpeg whenever new hardware acceleration is added, a corresponding HW pix_fmt is also added, and frames_derive_to/from is used to import/export to other underlying HW pix_fmts. So it is a rare usage when you want to add a hwaccel for rkmppdec but still borrow AV_PIX_FMT_DRM_PRIME.
Manually merged in 6584518
Oops, please do not merge yet; this hardly works as of now.
One question: https://github.com/nyanmisaka/ffmpeg-rockchip/commit/49cc93e8925eea1623cc6e2b0202a9a3225e11ec
Is this a good idea? Those headers are exported to the system, and this change will force all other DRM-based software to be recompiled. It is basically an API change for the DRM interface.
Maybe use one layer with a specific format like FOURCC("MBUF"), assign it to some object index, and get the ptr from the MppBuffer? Still a hack, but a non-breaking one.
Or add the new data to the end of the structure, so the old interface variables will not be repositioned.
UPDATE: it will not help to move the new fields to the end of AVDRMObjectDescriptor; they should go at the end of AVDRMFrameDescriptor instead, because the frame descriptor contains multiple AVDRMObjectDescriptor members, and the new, oversized AVDRMObjectDescriptor would overwrite the .nb_planes member.
Even if ptr and opaque were members of AVDRMObjectDescriptor, it would only work when AVDRMObjectDescriptor is not used as an array. But that is unconventional; a cleaner way might be to use layers, maybe...
Yeah, it's indeed a breaking change for upstreaming. So either append a new array after layers to correspond to the objects, or create a new format AV_PIX_FMT_RKMPP instead.
It would be nice if they had reserved some bits when defining AVDRMObjectDescriptor. It's not very intuitive to keep ptr and opaque separate from it.
With some code cleanup there will be no need for the MppBuffer object; the only thing necessary is to check whether the fd is mmapped or not (that's why you need ptr, I think).
I remember there was a way to do this without any flags, just with mmap, but I need to dig in a little bit.
In particular, for the part below, does it really have to check desc->objects[i].ptr?
```c
static int rkmpp_map_frame(AVHWFramesContext *hwfc,
                           AVFrame *dst, const AVFrame *src, int flags)
{
    ....
    av_assert0(desc->nb_objects <= AV_DRM_MAX_PLANES);
    for (i = 0; i < desc->nb_objects; i++) {
        if (desc->objects[i].ptr) {
            addr = desc->objects[i].ptr;
            map->unmap[i] = 0;
```
I think you are doing an optimization above to prevent unnecessary mmaps and unmaps.
In that case, maybe we can introduce a format modifier to flag whether the descriptor is mmapped or not.
OK, I created a separate PR for this; I think this is the cleanest way, no hacks involved.
This works pretty fine. Here are the commands to get mpv working:
--vo=gpu-next is optional, but it seems to work rather better.
--swapchain-depth is optional, but it helps to sync the buffers better.
Simple DRM prime output:

```shell
mpv --swapchain-depth=8 --vo=gpu-next --hwdec=rkmpp pathtofile
```

Avg HEVC performance: 8-bit: 8K@30fps; 10-bit: not supported.
AFBC -> non-AFBC (10-bit to 8-bit):

```shell
mpv --swapchain-depth=8 --vo=gpu-next --vd-lavc-o=buf_mode=ext,afbc=on --vf=scale_rkrga=format=nv12 --hwdec=rkmpp pathtofile
```

Avg HEVC performance: 8-bit AFBC -> 8-bit: 8K@60fps; 10-bit AFBC -> 8-bit: 8K@60fps.
AFBC -> non-AFBC (HDR to HDR):

```shell
mpv --swapchain-depth=8 --vo=gpu-next --vd-lavc-o=buf_mode=ext,afbc=on --vf=scale_rkrga=format=p010 --hwdec=rkmpp pathtofile
```

Avg HEVC performance: 10-bit AFBC -> 10-bit: 8K@30fps.
mpv requires the patch below to play P010 DRM frames.
```diff
diff --git a/video/out/hwdec/hwdec_drmprime.c b/video/out/hwdec/hwdec_drmprime.c
index 290f11c..9c63ab4 100644
--- a/video/out/hwdec/hwdec_drmprime.c
+++ b/video/out/hwdec/hwdec_drmprime.c
@@ -119,6 +119,7 @@
     MP_TARRAY_APPEND(p, p->formats, num_formats, IMGFMT_NV12);
     MP_TARRAY_APPEND(p, p->formats, num_formats, IMGFMT_420P);
     MP_TARRAY_APPEND(p, p->formats, num_formats, pixfmt2imgfmt(AV_PIX_FMT_NV16));
+    MP_TARRAY_APPEND(p, p->formats, num_formats, pixfmt2imgfmt(AV_PIX_FMT_P010));
     MP_TARRAY_APPEND(p, p->formats, num_formats, 0); // terminate it
     p->hwctx.hw_imgfmt = IMGFMT_DRMPRIME;
```
Maybe it would be a good idea to add a format option like 8bit for scale_rkrga, so that it will auto-select the correct 8-bit variant of a 10-bit frame, i.e. NV15->NV12 and NV20->NV16.
The swscale format filter of FFmpeg can accept multiple pixel formats and choose the appropriate one, like format=nv12|nv16. So it is possible to modify the rkrga filter options and parse them manually.

fmt1|fmt2 looks cleaner.
One other thing: mesa EGL requires 32-byte-aligned frames, but the decoder outputs 16-byte-aligned frames. This causes DRM prime frames not to be imported properly, and the screen shows green or blue.
Is it possible to add an hstride option to the scale_rkrga filter, so that after the RGA chain the frame can be imported into EGL properly?
Sure.
Will this bypass the panfork issue?
```c
layer->planes[i].offset =
    layer->planes[i-1].offset +
    layer->planes[i-1].pitch * (FFALIGN(hwfc->height, 64) >> (i > 1 ? pixdesc->log2_chroma_h : 0));
```
No, I just tested it. Having a different pitch value for the buffer will only change the buffer size and offsets; those initial values will be overwritten by mpp, like here.
To my knowledge mpp does not support preconfigured strides/pitches. If it did, maybe setting the hstride of the mppframe before decode_get_frame, by manually allocating the input mppframe, would do the trick within the decoder alone. But that is just a theory; I was working around it by realigning to a 64-byte hstride with RGA in the next chain.
I think this is better handled by mpp; I asked them if they can support a decoding flow like this, maybe through some dec cfg.
We really need this. For the H264 High10 use case, mpp uses hor_align=16 for both NV12 and NV15, while RGA3 requires hor_align=64 for NV15.
https://github.com/nyanmisaka/mpp/commit/a9869977a79863d1988afdfcba00f24200c38ffe
@nyanmisaka I tested this extensively; from my point of view it is ready to be merged. I could see no side effects so far.
Merged in d0a1484
This is a first attempt to provide a way for mpv and Kodi to work with the decoder.
With the above changes mpv works; however, there still seem to be copies involved, and it is slower than expected.
I have some ideas on what is going on, and I will update when I have a proper understanding.