gstreamer-java / gst1-java-swing

Swing integration for GStreamer and gst1-java-core
GNU Lesser General Public License v3.0

High CPU usage by xorg when rendering video on GstVideoComponent #3

Closed orlovsn closed 2 years ago

orlovsn commented 2 years ago

I'm trying to run the SwingPlayer example code on a Rockchip RK3399 CPU (NanoPi M4 board).
When I use pure GStreamer to play a 1080p RTSP stream:

gst-launch-1.0 -v playbin uri=rtsp://

I get ~20% CPU usage by GStreamer and ~2% by Xorg (so GStreamer has hardware acceleration for both H.264 decoding and rendering).
The same URI played by playbin on a GstVideoComponent leads to 100% CPU usage by Xorg and 150% by the JVM (while the MBean thread profiler shows that the busiest thread is GstBus, at only ~20%).

I tried forcing rkximagesink as sink:

vc = new GstVideoComponent(new AppSink("rkximagesink"))

but nothing changed

gst-inspect-1.0 rkximagesink
Factory Details:
  Rank                     secondary (128)
  Long-name                Video sink
  Klass                    Sink/Video
  Description              A standard X based videosink
  Author                   Julien Moutte <julien@moutte.net>

Plugin Details:
  Name                     rkximage
  Description              Rockchip X/DRM Video Sink
  Filename                 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstrkximage.so
  Version                  1.10.0
  License                  LGPL
  Source module            gst-rockchip
  Binary package           GStreamer Rockchip Plug-ins
  Origin URL               Unknown package origin

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseSink
                         +----GstVideoSink
                               +----GstRkXImageSink

Implemented Interfaces:
  GstNavigation
  GstVideoOverlay

Pad Templates:
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw

On SwingPlayer load I get some debug info in the console, which I suppose is generated by GStreamer:

Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: mpp_rt: found ion allocator
Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: mpp_rt: found drm allocator
Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: mpp_rt: use drm allocator for mpp_service
Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: mpp_info: mpp version: 6cc2ef5f author: Herman Chen   2021-09-17 [mpp_list]: Add list_mode and list_move_tail
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: mpp_rt: found ion allocator
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: mpp_rt: found drm allocator
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: mpp_rt: use drm allocator for mpp_service
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: mpp_info: mpp version: 6cc2ef5f author: Herman Chen   2021-09-17 [mpp_list]: Add list_mode and list_move_tail
Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: hal_h264d_rkv_reg: control info: fmt 7, w 352, h 288
Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: mpp_buf_slot: set frame info: w  352 h  288 hor  352 ver  288
Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: mpp_dec: setting default w  352 h  288 h_str  352 v_str  288
Jun 25 20:09:09 NanoPi-NEO4 startx[5905]: mpp[5905]: h264d_api: is_avcC=1
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: hal_h264d_rkv_reg: control info: fmt 7, w 352, h 288
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: mpp_buf_slot: set frame info: w  352 h  288 hor  352 ver  288
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: mpp_dec: setting default w  352 h  288 h_str  352 v_str  288
Jun 25 20:09:09 NanoPi-NEO4 mpp[5905]: h264d_api: is_avcC=1

If I understand correctly, the problem is that GstVideoComponent uses a VolatileImage for screen rendering, which is not accelerated on most ARM boards because of the lack of OpenGL support (and GLES is not supported by OpenJDK).
Is there any way to accelerate rendering under Xorg without relying on the JVM?
I previously used vlcj (Java bindings for libVLC), which embeds the video layer. Is that possible with gst1-java?

P.S. The same code under Windows 11 uses resources comparable to pure GStreamer under load (10+ 1080p streams).

ADDED: I tried FXPlayer on the same ARM board under Liberica JDK 17. With the default software renderer I get low CPU usage by Xorg (~2-5%) but high CPU usage by the JVM (180%+) for a single 1080p stream that is displayed at 1 fps. With -Dprism.order=es2 (OpenGL ES renderer) the whole 1 GB of RAM fills up in 2 seconds (Growing pool ES2 Vram Pool target to ...).

neilcsmith-net commented 2 years ago

This is kind of expected. Rendering into Swing or JavaFX will likely require colour conversion and bringing the video data onto the CPU / into the JVM, to be uploaded back to the GPU.
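To put a rough number on that round trip, here is a back-of-the-envelope sketch of how many bytes have to be moved when every decoded frame is copied once into JVM memory. The 4 bytes per pixel (a BGRx-style layout) and the 30 fps frame rate are assumptions for illustration, not values taken from this thread, and `FrameCopyCost` is a hypothetical helper name.

```java
public class FrameCopyCost {

    /** Bytes moved per second if every frame is copied exactly once. */
    static long bytesPerSecond(int width, int height, int bytesPerPixel, int fps) {
        return (long) width * height * bytesPerPixel * fps;
    }

    public static void main(String[] args) {
        // Assumed values: 1920x1080 frames, 4 bytes per pixel, 30 frames per second.
        long perStream = bytesPerSecond(1920, 1080, 4, 30);
        System.out.printf("one stream:  %.1f MB/s%n", perStream / 1e6);
        // Ten streams, as in the Windows comparison mentioned above.
        System.out.printf("ten streams: %.1f MB/s%n", 10 * perStream / 1e6);
    }
}
```

Under those assumptions a single 1080p stream is roughly 249 MB/s for one copy alone, before any colour conversion or upload back to the display, which is a lot of memory traffic for a small ARM board.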

You can do embedding of the video surface using VideoOverlay. Ideally with PlayBin, although you can use sinks directly. I still haven't got around to putting up an example of that. If you need help, ask on the mailing list and/or contact me via www.codelerity.com depending on the nature of the project and how much code you can share.
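A minimal, untested sketch of the VideoOverlay approach might look like the following. It is not an official example. It assumes gst1-java-core 1.x and JNA on the classpath, that `VideoOverlay.setWindowHandle` accepts a native window ID as a `long`, that `rkximagesink` is installed, and that a heavyweight AWT `Canvas` is an acceptable render target. `OverlaySketch` and the element names are hypothetical.

```java
import com.sun.jna.Native;
import java.awt.Canvas;
import java.awt.EventQueue;
import java.net.URI;
import javax.swing.JFrame;
import org.freedesktop.gstreamer.Element;
import org.freedesktop.gstreamer.ElementFactory;
import org.freedesktop.gstreamer.Gst;
import org.freedesktop.gstreamer.Version;
import org.freedesktop.gstreamer.elements.PlayBin;
import org.freedesktop.gstreamer.interfaces.VideoOverlay;

public class OverlaySketch {

    // Held in a static field so the pipeline is not garbage collected while playing.
    private static PlayBin playbin;

    public static void main(String[] args) {
        Gst.init(Version.BASELINE, "OverlaySketch");
        URI uri = URI.create(args[0]); // stream URI supplied on the command line

        EventQueue.invokeLater(() -> {
            // A heavyweight Canvas has its own native window for the sink to draw into.
            Canvas canvas = new Canvas();
            JFrame frame = new JFrame("VideoOverlay sketch");
            frame.add(canvas);
            frame.setSize(1280, 720);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setVisible(true); // the native window must exist before asking for its ID

            // Create the real sink by factory name; "videosink" is just its instance name.
            Element sink = ElementFactory.make("rkximagesink", "videosink");
            // Hand the native window ID to the sink so it renders inside the canvas.
            VideoOverlay.wrap(sink).setWindowHandle(Native.getComponentID(canvas));

            playbin = new PlayBin("playbin");
            playbin.setVideoSink(sink);
            playbin.setURI(uri);
            playbin.play();
        });
    }
}
```

With this layout the frames never enter the JVM: the sink draws straight into the native window, and Swing only provides the surrounding frame. The trade-off is that lightweight Swing components cannot be painted over the video area.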

I tried forcing rkximagesink as sink:

vc = new GstVideoComponent(new AppSink("rkximagesink"))

That doesn't do what you think it does! That just names an appsink element rkximagesink.
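The difference can be sketched as follows, assuming gst1-java-core 1.x and an installed `rkximagesink` plugin. `SinkNaming` and the instance name `videosink` are hypothetical, chosen only for illustration.

```java
import org.freedesktop.gstreamer.Element;
import org.freedesktop.gstreamer.ElementFactory;
import org.freedesktop.gstreamer.Gst;
import org.freedesktop.gstreamer.Version;
import org.freedesktop.gstreamer.elements.AppSink;

public class SinkNaming {
    public static void main(String[] args) {
        Gst.init(Version.BASELINE, "SinkNaming");

        // The constructor argument is the element's instance name.
        // This is still an appsink; it merely happens to be called "rkximagesink".
        AppSink appSink = new AppSink("rkximagesink");
        System.out.println(appSink.getName());

        // To instantiate the Rockchip sink, request it by factory name.
        // The second argument is the instance name. This call fails if the
        // plugin providing the factory is not installed.
        Element rkSink = ElementFactory.make("rkximagesink", "videosink");
        System.out.println(rkSink.getName());
    }
}
```

Since GstVideoComponent takes an AppSink, an element created the second way cannot be passed to it; it would have to be used as a video sink in its own right.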