geminixdev opened this issue 1 year ago
Just to make that clear: the slowness is not visible when testing with arm64-v8a devices. Unfortunately all Android boxes are armeabi-v7a, and there the slowness makes them unusable for anything better than SD.
You said that QtMultimedia for 5.15 does not have perf issues? There are two ways to render from QtMM: https://github.com/qt/qtmultimedia/blob/5.15/src/qtmultimediaquicktools/qdeclarativevideooutput_render.cpp#L344
This one uploads data to OpenGL textures (as I remember it is used by default on Android), and it is the one that should be used with QtAVPlayer and its QVideoFrame.
But it can also use the window-based renderer https://github.com/qt/qtmultimedia/blob/5.15/src/qtmultimediaquicktools/qdeclarativevideooutput_window.cpp#L88, which should be totally copy-free; rendering is done directly to a window without any video frames.
Also, when you convert QAVVideoFrame to QVideoFrame there is a hardcoded conversion https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavvideoframe.cpp#L314
which downloads data from the GPU, since there is no support for mediacodec in QRHI (as I remember).
Can you confirm that QML rendering is lagging? Using VideoOutput? Can you also disable sending the video frames to VideoOutput, while still receiving them from the player, and confirm that the CPU is low and the GUI acts fast?
Also, when you convert QAVVideoFrame to QVideoFrame there is a hardcoded conversion https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavvideoframe.cpp#L314
which downloads data from the GPU, since there is no support for mediacodec in QRHI (as I remember).
Line 314:
result = convertTo(AV_PIX_FMT_YUV420P);
But in context, in the lines just before that, there is the check for AV_PIX_FMT_NV12, see below:
case AV_PIX_FMT_NV12:
format = VideoFrame::Format_NV12;
break;
default:
// TODO: Add more supported formats instead of converting
result = convertTo(AV_PIX_FMT_YUV420P);
format = VideoFrame::Format_YUV420P;
break;
and in my checks (at least with the development devices in arm64-v8a) the frames delivered from avcodec_receive_frame
are AV_PIX_FMT_NV12, and still are AV_PIX_FMT_NV12 when going into that QAVVideoFrame to QVideoFrame conversion and also coming out of it. That makes me think that there is no conversion happening, and thus no slowdown caused by that conversion.
I will rerun timing and pixel format tests with an armeabi-v7a device, to make sure that it is the same there.
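For reference, the format check can be done roughly like this (a sketch; it assumes the underlying AVFrame is accessible via QAVVideoFrame::frame() as in QAVFrame, Qt 6 naming, and the QVideoFrame conversion operator under discussion):
#include <QDebug>
#include <QVideoFrame>
#include <QtAVPlayer/qavvideoframe.h>
extern "C" {
#include <libavutil/pixdesc.h>
}

void logFormats(const QAVVideoFrame &avFrame)
{
    // Pixel format as delivered by avcodec_receive_frame()
    qInfo() << "AVFrame format:"
            << av_get_pix_fmt_name(static_cast<AVPixelFormat>(avFrame.frame()->format));
    // The QAVVideoFrame -> QVideoFrame conversion under test
    QVideoFrame videoFrame = avFrame;
    qInfo() << "QVideoFrame pixel format:" << videoFrame.pixelFormat();
}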
Can you confirm that QML rendering is lagging? Using VideoOutput? Can you also disable sending the video frames to VideoOutput, while still receiving them from the player, and confirm that the CPU is low and the GUI acts fast?
- Checking whether decoding does not consume CPU and does not impact the GUI
- Checking whether rendering is efficient enough; it could even be checked without a player, just by sending many QVideoFrames to VideoOutput (see the sketch below).
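Such a rendering-only test could look roughly like this (a sketch assuming the Qt 6 QVideoSink API; the sink would be the videoSink of a QML VideoOutput, and all names are illustrative):
#include <cstring>
#include <QTimer>
#include <QVideoFrame>
#include <QVideoFrameFormat>
#include <QVideoSink>

// Push synthetic NV12 frames at 25 fps, with no player or decoder involved,
// to see whether VideoOutput rendering alone keeps up.
void startRenderOnlyTest(QVideoSink *sink)
{
    auto *timer = new QTimer(sink);
    QObject::connect(timer, &QTimer::timeout, sink, [sink] {
        QVideoFrame frame(QVideoFrameFormat(QSize(1920, 1080),
                                            QVideoFrameFormat::Format_NV12));
        if (frame.map(QVideoFrame::WriteOnly)) {
            std::memset(frame.bits(0), 128, frame.mappedBytes(0)); // grey luma
            std::memset(frame.bits(1), 128, frame.mappedBytes(1)); // neutral chroma
            frame.unmap();
        }
        sink->setVideoFrame(frame);
    });
    timer->start(40); // 25 fps
}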
One of my tests on an armeabi-v7a device was to comment out the videoSink->setVideoFrame() call, and that alone had absolutely no influence on the decoding time and the mediacodec CPU; it did not reduce them.
I will also rerun that, and maybe drop the frames right after avcodec_receive_frame().
You said that QtMultimedia for 5.15 does not have perf issues?
Yes, definitely: Full HD is no problem at all, smooth and with no GUI lagging, on the same armeabi-v7a devices.
That is especially irritating, as it contradicts the slow mediacodec decoding and its 100% CPU which I see in my tests on exactly the same devices.
Also, when you convert QAVVideoFrame to QVideoFrame there is a hardcoded conversion https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavvideoframe.cpp#L314 which downloads data from the GPU, since there is no support for mediacodec in QRHI (as I remember).
Line 314:
result = convertTo(AV_PIX_FMT_YUV420P);
But in context, in the lines just before that, there is the check for AV_PIX_FMT_NV12, see below:
case AV_PIX_FMT_NV12:
format = VideoFrame::Format_NV12;
break;
default:
// TODO: Add more supported formats instead of converting
result = convertTo(AV_PIX_FMT_YUV420P);
format = VideoFrame::Format_YUV420P;
break;
and in my checks (at least with the development devices in arm64-v8a) the frames delivered from
avcodec_receive_frame
are AV_PIX_FMT_NV12, and still are AV_PIX_FMT_NV12 when going into that QAVVideoFrame to QVideoFrame conversion and also coming out of it. That makes me think that there is no conversion happening, and thus no slowdown caused by that conversion. I will rerun timing and pixel format tests with an armeabi-v7a device, to make sure that it is the same there.
A first test result; it confirms, also on an armeabi-v7a device, that from
avcodec_receive_frame()
up to sending the frame to the video sink with videoSink->setVideoFrame(), the line
result = convertTo(AV_PIX_FMT_YUV420P);
is not triggered to run, not executed. Tests are ongoing.
Can you confirm that QML rendering is lagging? Using VideoOutput? Can you also disable sending the video frames to VideoOutput, while still receiving them from the player, and confirm that the CPU is low and the GUI acts fast?
- Checking whether decoding does not consume CPU and does not impact the GUI
- Checking whether rendering is efficient enough; it could even be checked without a player, just by sending many QVideoFrames to VideoOutput.
Also here a first test result:
Not executing videoSink->setVideoFrame() apparently does not change anything: still the same high CPU for mediacodec (99-100% for HD and Full HD), still the same decoding overload with Full HD. (The smoothness of HD cannot be checked without actually seeing the pictures.)
Tests are ongoing.
In QAVFrameCodec::decode()
there is the call to avcodec_receive_frame().
Logging the time it takes gives these results:
HD:
05-03 00:13:59.037 8197 8399 I Player : [03 0:13:59.036 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1280 x 720 frame->format: 23 frame->pts: 4338057600
05-03 00:13:59.073 8197 8399 I Player : [03 0:13:59.073 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1280 x 720 frame->format: 23 frame->pts: 4338061200
05-03 00:13:59.111 8197 8399 I Player : [03 0:13:59.111 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1280 x 720 frame->format: 23 frame->pts: 4338064800
This shows that the second avcodec_receive_frame()
took 37 ms, and the third 38 ms. These times are consistent, always more or less the same. Considering the 40 ms available per frame at 25 fps, we are at the limit here.
Full HD:
05-03 01:02:10.570 8197 8397 I Player : [03 1:02:10.570 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594467600
05-03 01:02:10.651 8197 8397 I Player : [03 1:02:10.651 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594471200
05-03 01:02:10.733 8197 8397 I Player : [03 1:02:10.733 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594474800
05-03 01:02:10.892 8197 8397 I Player : [03 1:02:10.892 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594478400
05-03 01:02:10.899 8197 8397 I Player : [03 1:02:10.898 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594482000
05-03 01:02:11.055 8197 8397 I Player : [03 1:02:11.055 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594485600
05-03 01:02:11.068 8197 8397 I Player : [03 1:02:11.067 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594489200
05-03 01:02:11.221 8197 8397 I Player : [03 1:02:11.221 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594492800
05-03 01:02:11.232 8197 8397 I Player : [03 1:02:11.231 +07 I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 23 frame->pts: 4594496400
There is a big fluctuation, between 6 ms and 150 ms, with an average of about 80 ms. Far too long for the 40 ms we have at 25 fps.
So the question seems to be: why is it so slow? Possibly an option or something in the codec context that must be set, or set differently?
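For reference, the timing can be captured like this (a sketch that times each call directly instead of comparing consecutive logcat timestamps; the real call site is inside QAVFrameCodec::decode()):
#include <QDebug>
#include <QElapsedTimer>
extern "C" {
#include <libavcodec/avcodec.h>
}

// Wraps avcodec_receive_frame() and logs how long the call itself takes.
int receiveFrameTimed(AVCodecContext *avctx, AVFrame *frame)
{
    QElapsedTimer timer;
    timer.start();
    const int ret = avcodec_receive_frame(avctx, frame);
    if (ret == 0) {
        qInfo() << "avcodec_receive_frame took" << timer.elapsed() << "ms,"
                << frame->width << "x" << frame->height
                << "format" << frame->format << "pts" << frame->pts;
    }
    return ret;
}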
The tests above were with an HK1 X4 box. Now below are tests with a Tanix TX6 box, and it gets weirder here:
05-03 03:13:10.200 7868 22724 I Player : [03 3:13:10.200 MYT I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 0 frame->pts: 7366644000
05-03 03:13:10.206 7868 22724 I Player : [03 3:13:10.206 MYT I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 0 frame->pts: 7366647600
05-03 03:13:10.210 7868 22724 I Player : [03 3:13:10.210 MYT I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 0 frame->pts: 7366651200
05-03 03:13:10.214 7868 22724 I Player : [03 3:13:10.214 MYT I] decode_mediacodec, after avcodec_receive_frame. frames.size() 1920 x 1080 frame->format: 0 frame->pts: 7366654800
Why is playback not smooth here for Full HD and HD (HD is almost OK)?
Possibly it is the high IOW of 23%:
User 2%, System 1%, IOW 23%, IRQ 0%
User 287 + Nice 4 + Sys 166 + Idle 10063 + IOW 3327 + IRQ 0 + SIRQ 22 = 13869
PID USER PR NI CPU% S #THR VSS RSS PCY Name
7868 u0_a89 20 0 1% S 49 2154028K 662952K unk my.player
2006 mediacod 20 0 0% S 14 166156K 66288K unk media.codec
1859 system 12 -8 0% S 22 119036K 9612K unk /system/bin/surfaceflinger
28418 u0_a12 20 0 0% S 28 1265700K 95768K unk com.google.android.gms
1997 audioser 20 0 0% S 9 39648K 4816K unk /system/bin/audioserver
30208 root 20 0 0% R 1 4780K 1568K unk top
(This is still without executing videoSink->setVideoFrame().)
- About NV12: interesting, but AV_PIX_FMT_MEDIACODEC should be used, otherwise it looks like software decoding.
Yes, that is what I expected too. However we seem to get NV12 and YUV420P.
About possible software decoding: I specifically set 'h264_mediacodec', and logcat seems to confirm that it's used.
- Wondering if QAVPlayer itself consumes too much CPU.
I assume that QAVPlayer, as a library, is included in the player's CPU%. According to top it is then not too much.
- You say that the pts diff between two frames is increasing?
Not the pts diff; that is always OK, 3600 (presumably in a 90 kHz timebase, i.e. exactly 40 ms per frame at 25 fps). With the times I refer to the logged timestamps of each log entry: [03 3:13:10.200 MYT I] and [03 3:13:10.206 MYT I] show a 6 ms difference, meaning that this avcodec_receive_frame() call needed 6 ms.
I will recheck with the first box, the HK1, if there is an error in logcat about h264_mediacodec.
I will recheck with the first box, the HK1, if there is an error in logcat about h264_mediacodec.
Logcat showed no indication that software decoding would be used. There are entries like these:
05-03 03:05:00.903 412 4104 D AmlogicVideoDecoderAwesome2: [22]"codecInit done"
05-03 03:05:00.903 412 4104 D AmlogicVideoDecoderAwesome2: [22]"mOutWidth is 1920 mOutHeight is 1080 mFlvFlag=0 mOutBufferCount is 10"
05-03 03:05:00.903 412 4104 D AmlogicVideoDecoderAwesome2: [22]"mOutBufferCount =10 mDecOutWidth 1920 mDecOutHeight 1088\n"
05-03 03:05:00.903 412 4104 D AmlogicVideoDecoderAwesome2: [22]"mIsNativeBuffers =0\n"
05-03 03:05:00.903 412 4104 D AmlogicVideoDecoderAwesome2: [22]"setUp mOutPortChanged=0\n"
05-03 03:05:00.903 412 4104 D AmlogicVideoDecoderAwesome2: [22]"use nv12\n"
05-03 03:05:00.903 412 4104 I OmxComponent: STATE_DONE: OMX_StateLoaded => OMX_StateIdle : OMX.amlogic.avc.decoder.awesome2
05-03 03:05:00.904 412 4104 I OmxComponent: OMX_CommandStateSet 850 Cmd 0 nParam1 0x3
05-03 03:05:00.904 412 4104 I OmxComponent: OMX-31 STATE_SET: OMX_StateIdle => OMX_StateExecuting : OMX.amlogic.avc.decoder.awesome2
05-03 03:05:00.904 412 4104 V AmlogicVideoDecoderAwesome2: [22]prepare:315 >
05-03 03:05:00.904 412 4104 I AmlogicVideoDecoderAwesome2: [22]"AllocDmaBuffers uvm mDecOutWidth:1920 mDecOutHeight:1088, 1920x1088"
which might explain why NV12 is used.
You should track the CPU usage of the application. There is a simple way to determine whether it is HW accelerated or not: just compare the CPU usage of the app with mediacodec and without.
I assume that QAVPlayer, as a library, is included in the player's CPU%. According to top it is then not too much.
Does that mean you see low CPU% but lags and delays in receiving/decoding frames?
I assume that QAVPlayer, as a library, is included in the player's CPU%. According to top it is then not too much.
Does that mean you see low CPU% but lags and delays in receiving/decoding frames?
Yes, but in two very different scenarios for the two armeabi-v7a devices, HK1 and TX6.
HK1 box playing
TX6 Box
Please note that avcodec_receive_frame() times are not decoding times. Due to the asynchronous process these are only the times avcodec_receive_frame() needs to return. Actual decoding might take longer.
You should track the CPU usage of the application. There is a simple way to determine whether it is HW accelerated or not: just compare the CPU usage of the app with mediacodec and without.
Done, yes, it definitely was not software decoding. With software decoding activated the player CPU is higher, much higher, and mediacodec CPU is low, irrelevant.
Interestingly avcodec_receive_frame() returns immediately, 0 ms.
More testing with the HK1 Box resulted in the box playing with software decoding in Full HD better than with Mediacodec.
With mediacodec the decoding speed seems to be just half of what's needed, but with software decoding the decoding speed is just fast enough.
Display is smooth: despite the player CPU being much higher, about 135%, there is no lagging, no micro-jumps; it is smooth.
That is most of the time; from time to time there seems to be some other activity on the box slowing it down, and then decoding gets too slow and there is some buffering, until the decoding catches up, and then it is smooth again for a while.
So on the HK1 box software decoding is much much much better than the stop and go of the decoding with mediacodec. Very clearly the problem is there with mediacodec.
On the other box, the TX6, software decoding is slower than mediacodec, as expected. I don't yet see where exactly the problem is with that box.
- Top shows 49% for the player and < 20 % for mediacodec. With software decoding activated the player CPU is higher, much higher, and mediacodec CPU is low, irrelevant.
Sorry, it's not clear here what "the player" and "mediacodec" are *-)
Top should show CPU for the entire process. And it seems using h264_mediacodec
decreases CPU, but there are some lags with frames on Full HD? Even though there is no rendering involved yet.
So on the HK1 box software decoding is much much much better than the stop and go of the decoding with mediacodec. Very clearly the problem is there with mediacodec.
Trying to find any configuration settings that might help here; maybe we need to increase the number of threads or the framerate or ...
I am not able to test myself right now, but it would be interesting to try some flags: https://ffmpeg.org/doxygen/4.1/structAVCodecContext.html
Force low delay: #define AV_CODEC_FLAG_LOW_DELAY (1 << 19)
Here https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavcodec.cpp#L77
d->avctx->flags = AV_CODEC_FLAG_LOW_DELAY | AV_CODEC_FLAG2_FAST
and also https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavcodec.cpp#L70
av_opt_set_int(d->avctx, "threads", 1???, 0);
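Put together, the experiment in qavcodec.cpp would look roughly like this (a sketch; note that AV_CODEC_FLAG2_FAST belongs in flags2, and |= keeps any flags already set):
// Before avcodec_open2() in qavcodec.cpp (sketch):
d->avctx->flags  |= AV_CODEC_FLAG_LOW_DELAY;   // ask the decoder for minimal latency
d->avctx->flags2 |= AV_CODEC_FLAG2_FAST;       // allow non-spec-compliant speedups
av_opt_set_int(d->avctx, "threads", 2, 0);     // instead of the hard-coded 1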
- Top shows 49% for the player and < 20 % for mediacodec. With software decoding activated the player CPU is higher, much higher, and mediacodec CPU is low, irrelevant.
Sorry, it's not clear here what "the player" and "mediacodec" are *-) Top should show CPU for the entire process. And it seems using
h264_mediacodec
decreases CPU, but there are some lags with frames on Full HD? Even though there is no rendering involved yet.
The "player" is the player code built with QtAVPlayer. The CPU% shown for it includes whatever player code is used plus the linked-in QtAVPlayer, together.
"mediacodec" is the CPU that h264_mediacodec is using. Although it should run in hardware, there seems to be CPU involved, I assume for moving the data in and out, or for the OS part of the API controlling mediacodec.
On the HK1 box using mediacodec for decoding, it looks like this, as shown in top in 'adb shell' (the 9th column is the CPU %, irrelevant rows removed):
HK1 SD 720x576:
412 mediacodec 20 0 138M 45M 39M S 44.6 1.1 536:47.36 media.codec hw/android.hardware.media.omx@1.0-service
2602 u0_a99 10 -10 1.7G 374M 175M S 40.6 9.9 0:47.70 my.player
HK1 HD 1280x720:
412 mediacodec 20 0 154M 52M 46M S 99.0 1.3 537:41.57 media.codec hw/android.hardware.media.omx@1.0-service
2602 u0_a99 10 -10 1.6G 335M 185M S 44.3 8.9 1:27.03 my.player
HK1 FHD 1920x1080:
412 mediacodec 20 0 188M 69M 63M S 99.0 1.8 538:42.79 media.codec hw/android.hardware.media.omx@1.0-service
2602 u0_a99 10 -10 1.7G 393M 211M S 39.0 10.4 1:55.56 my.player
or shortened, with irrelevant columns removed, keeping only the CPU%:
HK1 SD 720x576:
412 mediacodec ... 44.6 ... media.codec hw/android.hardware.media.omx@1.0-service
2602 u0_a99 ... 40.6 ... my.player
HK1 HD 1280x720:
412 mediacodec ... 99.0 ... media.codec hw/android.hardware.media.omx@1.0-service
2602 u0_a99 ... 44.3 ... my.player
HK1 FHD 1920x1080:
412 mediacodec ... 99.0 ... media.codec hw/android.hardware.media.omx@1.0-service
2602 u0_a99 ... 39.0 ... my.player
So on the HK1 box software decoding is much much much better than the stop and go of the decoding with mediacodec. Very clearly the problem is there with mediacodec.
Trying to find any configuration settings that might help here; maybe we need to increase the number of threads or the framerate or ...
I am not able to test myself right now, but it would be interesting to try some flags: https://ffmpeg.org/doxygen/4.1/structAVCodecContext.html
Force low delay: #define AV_CODEC_FLAG_LOW_DELAY (1 << 19)
Here https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavcodec.cpp#L77
d->avctx->flags = AV_CODEC_FLAG_LOW_DELAY | AV_CODEC_FLAG2_FAST
and also https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavcodec.cpp#L70
av_opt_set_int(d->avctx, "threads", 1???, 0);
Yes, thanks, I will test this!
av_opt_set_int(d->avctx, "threads", 2, 0);
and
d->avctx->flags = AV_CODEC_FLAG_LOW_DELAY | AV_CODEC_FLAG2_FAST;
did not produce a noticeable change, unfortunately.
Interesting; it might mean that decoding itself is quite "slow", since there is no rendering involved but it already consumes some time?
- Why the frames are received in NV12 and YUV420P, and not in AV_PIX_FMT_MEDIACODEC, and
- what to add to get the frames in AV_PIX_FMT_MEDIACODEC, or better said how to get an AV_PIX_FMT_MEDIACODEC surface, and
- how to render this surface
is explained here: https://speakerdeck.com/tmm1/video-decoding-with-ffmpeg-on-ios-and-android?slide=34 on slides 34 to 38, and here: http://mplayerhq.hu/pipermail/ffmpeg-devel/2016-March/191700.html, which will result in a nice performance boost:
On a nexus 5, decoding an h264 stream (main profile) 1080p at 60fps:
- software output + rgba conversion goes at 59~60fps
- surface output + render on a surface goes at 100~110fps
And here is how Qt implements this in Qt 6: https://codereview.qt-project.org/c/qt/qtmultimedia/+/449591/2/src/plugins/multimedia/ffmpeg/qffmpeghwaccel_mediacodec.cpp
In summary, we have to tell Mediacodec to decode to a surface, and then render this surface directly.
So, once again, a very special process for mediacodec decoding.
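For reference, the FFmpeg side of that recipe looks roughly like this (a sketch based on libavcodec/mediacodec.h; obtaining the android.view.Surface jobject via JNI is assumed and not shown, and the init is typically done before avcodec_open2()):
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavcodec/mediacodec.h>
}

// Configure h264_mediacodec to decode directly into an Android Surface.
// After this, decoded frames arrive with format AV_PIX_FMT_MEDIACODEC.
int initSurfaceOutput(AVCodecContext *avctx, void *surface /* jobject android.view.Surface */)
{
    AVMediaCodecContext *mcCtx = av_mediacodec_alloc_context();
    if (!mcCtx)
        return AVERROR(ENOMEM);
    return av_mediacodec_default_init(avctx, mcCtx, surface);
}

// For each received AV_PIX_FMT_MEDIACODEC frame, data[3] holds an AVMediaCodecBuffer;
// releasing it with render=1 pushes the picture to the Surface (zero copy).
void renderToSurface(AVFrame *frame)
{
    auto *buffer = reinterpret_cast<AVMediaCodecBuffer *>(frame->data[3]);
    av_mediacodec_release_buffer(buffer, 1);
}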
Interesting; it might mean that decoding itself is quite "slow", since there is no rendering involved but it already consumes some time?
Based on my last post here, my guess is that this box, the HK1, has a bug when mediacodec outputs NV12 at higher resolutions. With HD it's still OK. There is another box, brand Ugoos, which seems to show the same behavior.
For the other box, the TX6, it seems that the rendering of the YUV420P frames is done in software; not very slow, but at the limit for HD and not fast enough to play anything better than HD smoothly.
I think both boxes might be perfectly OK once 'decoding to the surface which gets displayed directly' can be implemented. (This is based on seeing that they play Full HD/1080p perfectly fine with QtMM.)
- Why the frames are received in NV12 and YUV420P, and not in AV_PIX_FMT_MEDIACODEC, and
- what to add to get the frames in AV_PIX_FMT_MEDIACODEC, or better said how to get an AV_PIX_FMT_MEDIACODEC surface, and
- how to render this surface
is explained here: https://speakerdeck.com/tmm1/video-decoding-with-ffmpeg-on-ios-and-android?slide=34 on slides 34 to 38, and here: http://mplayerhq.hu/pipermail/ffmpeg-devel/2016-March/191700.html
... In summary, we have to tell Mediacodec to decode to a surface, and then render this surface directly.
Following the recipe posted above, I seem to have it working: avcodec_receive_frame() produces frames in pixel format AV_PIX_FMT_MEDIACODEC, and they are rendered to a surface. Still missing is embedding the surface somewhere so that it is visible. Nevertheless, no errors.
Super, could you please share how you integrated this into QAVPlayer? It would need to be placed inside https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavhwdevice_mediacodec.cpp
Super, could you please share how you integrated this into QAVPlayer? It would need to be placed inside https://github.com/valbok/QtAVPlayer/blob/master/src/QtAVPlayer/qavhwdevice_mediacodec.cpp
Yes, I will. To be sure it really works I need to complete the last step somehow: making the surface visible, in a QML VideoOutput or equivalent, i.e. how to display an android.view.Surface in a QML item. I'm working on that now, as time allows. I'm not an expert on that though; that was always the part Qt took care of.
Once I can see it and confirm that it works, then there is the code cleanup. Currently it's quick and dirty.
Yes, I have seen qavhwdevice_mediacodec.cpp as the best place for such code.
Update: I have spent days trying to get the rendering part to work. All code examples are, however, about displaying pictures stored in regular RAM on screen, using OpenGL. As described above, we need to display a texture which already is a texture (an Android SurfaceTexture) and which mediacodec can show in a surface directly, zero copy. So we need to make that surface visible, by having a QML item / QML VideoOutput use the same Surface/SurfaceTexture. Any route through videoSink->setVideoFrame() seems to be useless for that, as it does not expect an Android SurfaceTexture, and it would try to duplicate the rendering which mediacodec already does.
The good news is that Qt has implemented that; it's now in QtMM, but I think it will not be released before Qt 6.5.1. It might not be accessible through the public API though, only through private APIs.
Qt 6.5.1 will be released in about a week, then I can continue testing if it is possible to hook in there.
Alternatively:
I would need this great KDAB example, "How to create a zero-copy Android SurfaceTexture QML", updated for Qt 6. Unfortunately Qt changed all the QSGSimple... classes due to building on RHI instead of OpenGL. As that is not my area of expertise, I cannot do that (yet) myself. Otherwise I think that code does the same thing and might avoid using QtMM code; I would prefer that.
Thanks, keep us updated =)
I have spent days trying to get the rendering part to work. All code examples are, however, about displaying pictures stored in regular RAM on screen, using OpenGL. As described above, we need to display a texture which already is a texture (an Android SurfaceTexture) and which mediacodec can show in a surface directly, zero copy. So we need to make that surface visible, by having a QML item / QML VideoOutput use the same Surface/SurfaceTexture. Any route through videoSink->setVideoFrame() seems to be useless for that, as it does not expect an Android SurfaceTexture, and it would try to duplicate the rendering which mediacodec already does.
I started to look at this and found that QtMM creates a SurfaceTexture, which is then attached to a created GL texture using https://developer.android.com/reference/android/graphics/SurfaceTexture#attachToGLContext(int).
And GL_TEXTURE_EXTERNAL_OES is important here. I am trying to implement this for Qt 5, with no luck yet, since the renderers use GL_TEXTURE_2D. Maybe we can avoid Qt 5 entirely.
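On the GL side, the important detail is the texture target and sampler type (a sketch; the SurfaceTexture attach/update calls happen on the Java side via attachToGLContext()/updateTexImage()):
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>   // for GL_TEXTURE_EXTERNAL_OES

// Create the texture the decoder's SurfaceTexture gets attached to.
// It must be bound as GL_TEXTURE_EXTERNAL_OES, not GL_TEXTURE_2D,
// and sampled in the shader with samplerExternalOES
// (#extension GL_OES_EGL_image_external : require).
GLuint createExternalOesTexture()
{
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex);
    glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    // Java side: surfaceTexture.attachToGLContext(tex);
    // then per frame: surfaceTexture.updateTexImage();
    return tex;
}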
https://github.com/valbok/QtAVPlayer/pull/363 (only for Qt 6)
Could you please confirm that it works? Then I will close it.
#363 (only for Qt 6)
Great, thanks, I'm happy to see this! I will be back on Android and testing this in the next few days!
Sorry for the long delay! I finally managed to test these changes. I see that a lot has changed, not only the Android-specific code.
Everything still works very well on the arm64-v8a devices, nice and smooth.
And excellent news also for the two armeabi-v7a devices I'm testing with, the HK1 and TX6: both now behave normally, in all resolutions, with no CPU overload. A huge difference compared to before!
The stuttering seen before when playing 720p or 1080p is gone, as well as the funny behavior of the HK1 when decoding 1080p (as described above); everything behaves normally.
It is almost perfect now!
So what is still not perfect? Playback is not 100% smooth, and that is in all resolutions; 1080p is not worse than 720p. Both have a slight 'hanging' of the picture, which is mostly noticeable with regular movements. With chaotic movements or with not much moving, for example some people standing and talking, it is almost unnoticeable.
The tool "top" shows no CPU overload at all for both devices in all resolutions; there is no reason visible in top (which shows the CPU for the app and for mediacodec) why this would happen.
The only hint I found was logcat showing log spam like the lines below, apparently at every frame sent to the display:
On the HK1 Box:
[SurfaceTexture-0-3799-0] bindTextureImage: clearing GL error: 0x500
On the TX6 Box (Android 7):
W GLConsumer: [SurfaceTexture-0-4973-0] bindTextureImage: clearing GL error: 0x500
This sounds like a shader issue, but then I would expect it to show no picture at all.
Possibly, with Android mediacodec, the texture should better not be pushed to the display with videoSink->setVideoFrame(), but handled the way Qt seems to do it in QtMM: just triggering an update of the shown Android surface texture, where mediacodec has already placed the image during decoding.
Great news! Could you also try to receive the frames using Qt::DirectConnection? Qt 6 changed the limitation and all frames can now be delivered to the video sink on different threads, and it might be possible that the event queue hangs a bit?
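A minimal sketch of that (it assumes the QAVPlayer::videoFrame signal and the QAVVideoFrame-to-QVideoFrame conversion as used in the QtAVPlayer examples; videoSink would be the QVideoSink of the QML VideoOutput):
QObject::connect(&player, &QAVPlayer::videoFrame, videoSink,
    [videoSink](const QAVVideoFrame &frame) {
        // Runs on the decoding thread: deliver the frame to the sink directly,
        // without going through the receiver's event queue.
        videoSink->setVideoFrame(frame);
    },
    Qt::DirectConnection);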
Great news! Could you also try to receive the frames using Qt::DirectConnection? Qt 6 changed the limitation and all frames can now be delivered to the video sink on different threads, and it might be possible that the event queue hangs a bit?
Using Qt::DirectConnection to receive the frames did not help. Also, there are now crashes around QVideoFrame (backtrace in logcat), which I hadn't noticed before.
I measured the times between the videoSink->setVideoFrame() calls and they are as expected (always around 40 ms for 25 fps content) and don't explain any hanging.
However, when I measure the time between the VideoBuffer_MediaCodec::handle() calls, where the display-frame magic with the Android surface texture happens, they are between 30 ms and more than 50 ms; occasionally even < 20 ms, and often > 50 ms, up to 54 ms. This seems to reflect exactly the unsmoothness.
I would have expected them to be regular, around 40 ms, the same as the videoSink->setVideoFrame() calls which trigger them. I assume the video sink code and VideoBuffer_MediaCodec::handle() run in the same thread as QML.
However, when I measure the time between the VideoBuffer_MediaCodec::handle() calls, where the display-frame magic with the Android surface texture happens, they are between 30 ms and more than 50 ms; occasionally even < 20 ms, and often > 50 ms, up to 54 ms. This seems to reflect exactly the unsmoothness.
To see if there is something somewhere slowing down the event loop and causing the time differences seen in the VideoBuffer_MediaCodec::handle() method, I let the thread sleep each time, delaying the texture update to a minimum interval of 37 ms: whenever the delay since the previous run of handle() was shorter than that, the function had to sleep to compensate (see the sketch below). And the result was that playback is now much smoother, even on the old armeabi-v7a devices.
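The workaround, in sketch form (names are illustrative; the real code sits at the top of the handle() implementation):
#include <QElapsedTimer>
#include <QThread>

// Pace the SurfaceTexture updates: never run two updates closer together
// than ~37 ms, sleeping to compensate when the previous run was too recent.
void pacedHandle()
{
    static QElapsedTimer sinceLastUpdate;
    constexpr qint64 minIntervalMs = 37;   // just below the 40 ms frame period at 25 fps

    if (sinceLastUpdate.isValid()) {
        const qint64 elapsed = sinceLastUpdate.elapsed();
        if (elapsed < minIntervalMs)
            QThread::msleep(static_cast<unsigned long>(minIntervalMs - elapsed));
    }
    sinceLastUpdate.restart();

    // ... original handle() body: update the external OES texture here ...
}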
My interpretation is that the variable delay seen in the handle() function is caused somewhere in the QtMM code handling the videoSink->setVideoFrame() calls, or in the Android code involved.
Nevertheless, for my application, using this dirty workaround of sleeping in the handle() function, it now works sufficiently well!!!
There is still the "bindTextureImage: clearing GL error: 0x500" spam in the logs on almost all devices. It seems to have no other negative effect though. Unless you have an idea of how this can be fixed or avoided, this issue can be closed.
I realize that it might not be an option for general use to add such a delay control in the handle function. I also have only tested with Qt 6.4.3, not yet with Qt 6.5.2. Possibly QtMM behaves differently there.
After extensively comparing the rendering smoothness of Qt 5 QtMM with this solution, it was obvious that the rendering of the GL_TEXTURE_EXTERNAL_OES frames showed the unreliable timing on all devices and never seemed to play really smoothly. So I ended up with this:
armeabi-v7a devices:
arm64-v8a devices: I use QtAVPlayer with mediacodec decoding, but not decoding to GL_TEXTURE_EXTERNAL_OES (by not feeding the Android Surface to the Android hardware context).
I might retest with Qt 6.5.2 later, but due to Qt's switch to cmake, and the unavailability of the QtAVPlayer module with qmake in Qt 6.5.2, I first have to learn how to use cmake for Qt on Android.
Thanks. It would still be necessary to dive into the issues with GL_TEXTURE_EXTERNAL_OES; it is interesting what happens there.
I might retest with Qt 6.5.2 later, but due to Qt's switch to cmake, and the unavailability of the QtAVPlayer module with qmake in Qt 6.5.2, I first have to learn how to use cmake for Qt on Android.
I started to think that no libs are needed here, https://github.com/valbok/QtAVPlayer/issues/374, and maybe it is easier to always build statically into an app. It is still not clear how to deal with configure options that way, but I would like to consider never building QtAVPlayer separately; it should always be part of an app.
I started to think that no libs are needed here, #374, and maybe it is easier to always build statically into an app. It is still not clear how to deal with configure options that way, but I would like to consider never building QtAVPlayer separately; it should always be part of an app.
Which would allow continuing to use qmake in Qt 6.5, right? Yes, I would like to try that!
I started to think that no libs are needed here, #374, and maybe it is easier to always build statically into an app. It is still not clear how to deal with configure options that way, but I would like to consider never building QtAVPlayer separately; it should always be part of an app.
Which would allow continuing to use qmake in Qt 6.5, right? Yes, I would like to try that!
https://github.com/valbok/QtAVPlayer/pull/389 cmake support is totally removed
for arm64-v8a devices I use QtAVPlayer with mediacodec decoding, but not decoding to GL_TEXTURE_EXTERNAL_OES (by not feeding the Android Surface to the Android hardware context).
Could you confirm that QtMM works there? It also uses GL_TEXTURE_EXTERNAL_OES, and if it is smooth, then there is a bug in QtAVPlayer that needs to be fixed.
... it was obvious that the rendering of the GL_TEXTURE_EXTERNAL_OES frames showed the unreliable timing on all devices ...
Technically QtAVPlayer and QtMM should provide the same performance since they use the same implementation. If there is a difference in performance, we need to make sure there are no bugs in QtAVPlayer; for example, using DirectConnection for the frames is mandatory, etc.
In #273 you explain the process between decoding and rendering clearly:
How about Android with h264_mediacodec?
The frames delivered by
avcodec_receive_frame
seem to be NV12 and do not seem to get converted or mapped; when sending them to the video sink they are still NV12 (at least I don't see where mapping or conversion would happen). On armeabi-v7a devices the h264_mediacodec decoding is too slow, already no longer smooth with HD. Top shows 100% CPU load, for mediacodec??? (On arm64-v8a only around 20% for mediacodec.)
So for arm64-v8a devices the decoding and the process after
avcodec_receive_frame
seem to be very fast; everything plays smoothly, even on older devices. However, on armeabi-v7a devices, usually Android boxes, despite having fast CPUs, the transfer of data in and out of mediacodec plus the decoding is ridiculously slow, and the CPU usage (SD 40%, HD 100%) is far too high for hardware decoding.
What could be wrong there, or different there?