caprica / vlcj

Java framework for the vlc media player
http://www.capricasoftware.co.uk/projects/vlcj

Significant performance problem if the window is large #1234

Closed Him188 closed 4 months ago

Him188 commented 4 months ago

Environment:

My app is a Compose Multiplatform app. To the best of my knowledge, Compose is built on top of AWT and draws on the main thread. Compose can measure how many frames it actually drew over a given interval. My monitor has a refresh rate of 165Hz.

Check FPS statistics below:

// Playing in a small window size, 1000*800
[1809439736] FPS 163 (62-382)
[1809441741] FPS 158 (60-533)
[1809443747] FPS 161 (25-401)
[1809445752] FPS 164 (79-357)
[1809447756] FPS 164 (78-303)
[1809449761] FPS 164 (78-337)
[1809451766] FPS 164 (74-352)
[1809453773] FPS 159 (40-392)

// Playing in a mostly full-screen window size, 1800*1400
[1809759301] FPS 102 (28-756)
[1809761331] FPS 103 (22-675)
[1809763335] FPS 105 (31-769)
[1809765340] FPS 104 (27-739)

// Full-screen mode, 2160*1440
[1809455777] FPS 83 (5-476)
[1809457778] FPS 75 (8-436)
[1809459779] FPS 89 (30-530)
[1809461781] FPS 86 (28-519)
[1809463781] FPS 73 (8-436)
[1809465783] FPS 89 (26-606)

FPS decreases significantly as the window gets larger.
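To put the logged numbers in perspective, here is a small Kotlin sketch that averages the FPS samples above and converts a frame rate into time per frame. The helper names (`meanFps`, `frameTimeMs`, `pixelCount`) are made up for this illustration; the sample values are copied from the logs.

```kotlin
// FPS samples copied from the logs above (the first number after "FPS" on each line).
val smallWindow = listOf(163, 158, 161, 164, 164, 164, 164, 159) // 1000*800
val largeWindow = listOf(102, 103, 105, 104)                     // 1800*1400
val fullScreen = listOf(83, 75, 89, 86, 73, 89)                  // 2160*1440

/** Mean of the sampled FPS values. */
fun meanFps(samples: List<Int>): Double = samples.average()

/** Average time spent per frame, in milliseconds, at the given frame rate. */
fun frameTimeMs(fps: Double): Double = 1000.0 / fps

/** Number of pixels in a window of the given size. */
fun pixelCount(width: Int, height: Int): Long = width.toLong() * height
```

With these samples the means come out at 162.125, 103.5 and 82.5 FPS, which is roughly 6.2 ms per frame in the small window versus 12.1 ms in full screen. The full-screen window has about 3.9 times as many pixels as the 1000*800 one (3,110,400 vs 800,000).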

Profiling results

Small window, 1000*800, 165FPS

Compose accounts for 58% of the execution time and vlcj for 39.9%. vlcj looks good at this point.


Full screen, 2160*1440, 85FPS (about half)

vlcj accounts for 84% of the time, so each frame takes too long to render, which is why the average FPS is low.


caprica commented 4 months ago

CallbackMediaPlayerComponent is basically software rendering. There are full-frame buffer copies going on, for every frame.

Performance may be improved with latest vlcj 5.x snapshot and latest VLC, but you can't get away from it being software rendering instead of native hardware rendering.

caprica commented 4 months ago

The JavaFX renderer has at least one fewer full-frame buffer copy.

If Compose supports an efficient way to render images from native memory, you can write your own VideoSurface implementation and plug it in to vlcj.

Him188 commented 4 months ago

Compose uses Skia, which is a native library and has hardware acceleration. So theoretically it is possible.

Him188 commented 4 months ago

Thanks for your hint @caprica. I fixed this problem.

It has nothing to do with buffer copying, as copies are actually very fast.

I implemented a custom CallbackMediaComponent that renders into a Skia native buffer instead of an AWT panel. The buffer is copied from CallbackMediaComponent::image to native memory in DefaultRenderCallback::onDisplay, which introduces 2-3 full copies. Skia then renders natively from that buffer.

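import androidx.compose.ui.graphics.ImageBitmap
import androidx.compose.ui.graphics.asComposeImageBitmap
import java.awt.image.BufferedImage
import org.jetbrains.skia.Bitmap
import org.jetbrains.skia.ColorAlphaType
import org.jetbrains.skia.ImageInfo

fun BufferedImage.toComposeImageBitmap(): ImageBitmap {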
    // TODO(demin): use toBitmap().asImageBitmap() from skiko, when we fix its performance
    //  (it is 40x slower)

    val bytesPerPixel = 4
    val pixels = ByteArray(width * height * bytesPerPixel)

    var k = 0
    for (y in 0 until height) {
        for (x in 0 until width) {
            val argb = getRGB(x, y)
            val a = (argb shr 24) and 0xff
            val r = (argb shr 16) and 0xff
            val g = (argb shr 8) and 0xff
            val b = (argb shr 0) and 0xff
            pixels[k++] = b.toByte()
            pixels[k++] = g.toByte()
            pixels[k++] = r.toByte()
            pixels[k++] = a.toByte()
        }
    }

    val bitmap = Bitmap()
    bitmap.allocPixels(ImageInfo.makeS32(width, height, ColorAlphaType.UNPREMUL))
    bitmap.installPixels(pixels)
    return bitmap.asComposeImageBitmap()
}

I don't think the compiler or the JIT can automatically vectorize this code. However, this approach runs at a steady 144FPS on my M2 Max MacBook (which was also suffering from a low frame rate) with 100% software rendering. It achieves 144FPS with 80% CPU usage, while the AWT solution gives 70FPS with 150% CPU. With GPU rendering the CPU usage drops to 65%, which is acceptable.

So, the problem comes from the AWT components.
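As an aside on the conversion loop above: one possible way to avoid calling `getRGB(x, y)` once per pixel is to fetch all pixels with the bulk `getRGB` overload and repack them in a single pass. The sketch below is an illustration only. The extension name `toBgraBytes` is hypothetical, it produces the same BGRA byte layout as the loop above, and it has not been benchmarked against it.

```kotlin
import java.awt.image.BufferedImage

/**
 * Converts the image to a BGRA byte array (4 bytes per pixel, row-major),
 * the same layout the per-pixel loop produces.
 */
fun BufferedImage.toBgraBytes(): ByteArray {
    // One bulk call instead of width * height calls to getRGB(x, y).
    val argb: IntArray = getRGB(0, 0, width, height, null, 0, width)
    val out = ByteArray(argb.size * 4)
    var k = 0
    for (p in argb) {
        // Int.toByte() keeps only the low 8 bits, so no masking is needed.
        out[k++] = p.toByte()           // blue
        out[k++] = (p shr 8).toByte()   // green
        out[k++] = (p shr 16).toByte()  // red
        out[k++] = (p ushr 24).toByte() // alpha
    }
    return out
}
```

The resulting array could be handed to `installPixels` in the same way as `pixels` in the function above. Whether it makes a measurable difference would need profiling on the target machine.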

Him188 commented 4 months ago

The CPU usage is more likely a Compose issue. On Windows the app uses only <10% CPU and <10% GPU, and renders at 165FPS.

Him188 commented 4 months ago

So, for those who are also suffering from performance issues: consider switching to a UI platform other than the original AWT.

Although vlcj is already very nice, I think a better future for vlcj is to add ready-to-use support for modern UI frameworks like Compose for Desktop and even Jetpack Compose on Android. On Android a popular solution is ExoPlayer by Google; however, it has very poor support for subtitles. Many of my clients suggest that VLC works well on Android (based on their experience with other VLC wrappers, e.g. for Dart).

mahozad commented 4 months ago

Hello. This may be related: https://github.com/JetBrains/compose-multiplatform/pull/3336

caprica commented 4 months ago

Interesting, but in that Compose PR it seemingly uses invokeLater to update the video image, and that seems a bit dubious to me.

Him188 commented 4 months ago

Yeah, this is real native rendering, which I failed to implement yesterday. @mahozad thanks!

https://github.com/caprica/vlcj/issues/1098#issuecomment-1629328602

Him188 commented 4 months ago

I think we can collect solutions like this and post them somewhere in the vlcj tutorials? This will make people's lives much easier.

caprica commented 4 months ago

I'm a bit unclear on the conclusion here - is the Compose PR posted above the solution that's being advocated here?

I think relying on SwingUtilities#invokeLater to repaint the video frames is a bit "wrong". VLC is sending frames for rendering at the "correct" pace, the Swing rendering thread is disconnected from that. So I'd expect with this solution you probably see occasional micro-stuttering or other annoying glitchy artefacts instead of smooth video.

Happy to be proven wrong of course.
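To illustrate the concern about the UI thread being disconnected from VLC's pace, here is a minimal sketch of a "latest frame wins" handoff between a callback thread and a UI thread. The class `LatestFrame` and its methods are hypothetical and are not part of vlcj, Swing, or Compose. The sketch only shows what happens when the producer and the consumer run on independent schedules.

```kotlin
import java.util.concurrent.atomic.AtomicLong
import java.util.concurrent.atomic.AtomicReference

/**
 * Single-slot mailbox: the producer overwrites the slot and the consumer
 * takes whatever is newest. Frames published between two takes are dropped.
 */
class LatestFrame<T : Any> {
    private val slot = AtomicReference<T?>(null)
    private val dropped = AtomicLong(0)

    /** Called on the decoder/callback thread for every frame. */
    fun publish(frame: T) {
        // A non-null previous value means the consumer never saw that frame.
        if (slot.getAndSet(frame) != null) dropped.incrementAndGet()
    }

    /** Called on the UI thread; returns null if no new frame has arrived. */
    fun take(): T? = slot.getAndSet(null)

    /** Number of frames that were overwritten before being taken. */
    fun droppedCount(): Long = dropped.get()
}
```

If the UI thread repaints less often than frames arrive, `droppedCount()` grows. If it repaints more often, `take()` returns null and the previous frame stays on screen. Either case means the displayed cadence differs from the one VLC intended, which is where occasional stutter could come from.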

Him188 commented 4 months ago

Maybe we can just remove SwingUtilities.invokeLater { and it will work. I did something very similar without invokeLater, and it worked on my machine. I played a 20-minute video and saw no glitches.

var composeImage: ImageBitmap by mutableStateOf(ImageBitmap(128, 128))

override fun onDisplay(mediaPlayer: MediaPlayer, buffer: IntArray) {
    image?.let {
        composeImage = it.toComposeImageBitmap()
    }
}

That said, I suspect Compose may fail if a mutable state is accessed from a non-main thread. I'm not a pro in this.

Him188 commented 4 months ago

I've tested that https://github.com/caprica/vlcj/issues/1098#issuecomment-1629328602 does not work when the window is resized. The invokeLater is also required.

DrewCarlson commented 4 months ago

Something like this should be better than the other versions you've referenced.

class SkiaImageVideoSurface : VideoSurface(VideoSurfaceAdapters.getVideoSurfaceAdapter()) {

    private val videoSurface = SkiaImageCallbackVideoSurface()
    private lateinit var pixmap: Pixmap
    private val skiaImage = mutableStateOf<Image?>(null)

    val image: State<Image?> = skiaImage

    private inner class SkiaImageBufferFormatCallback : BufferFormatCallback {
        private var sourceWidth: Int = 0
        private var sourceHeight: Int = 0

        override fun getBufferFormat(sourceWidth: Int, sourceHeight: Int): BufferFormat {
            this.sourceWidth = sourceWidth
            this.sourceHeight = sourceHeight
            return RV32BufferFormat(sourceWidth, sourceHeight)
        }

        override fun allocatedBuffers(buffers: Array<ByteBuffer>) {
            val buffer = buffers[0]
            val pointer = ByteBufferFactory.getAddress(buffer)
            val imageInfo = ImageInfo.makeN32Premul(sourceWidth, sourceHeight, ColorSpace.sRGB)
            pixmap = Pixmap.make(imageInfo, pointer, sourceWidth * 4)
        }
    }

    private inner class SkiaImageRenderCallback : RenderCallback {
        override fun display(
            mediaPlayer: MediaPlayer,
            nativeBuffers: Array<ByteBuffer>,
            bufferFormat: BufferFormat,
        ) {
            skiaImage.value = Image.makeFromPixmap(pixmap)
        }
    }

    private inner class SkiaImageCallbackVideoSurface : CallbackVideoSurface(
        SkiaImageBufferFormatCallback(),
        SkiaImageRenderCallback(),
        true,
        videoSurfaceAdapter,
    )

    override fun attach(mediaPlayer: MediaPlayer) {
        videoSurface.attach(mediaPlayer)
    }
}

You can draw the Skia Image into a canvas with canvas.nativeCanvas:

Canvas(
    Modifier
        .fillMaxSize()
        .background(Color.Black)
) {
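    drawIntoCanvas { canvas ->
        // Assumes `image` holds the current nullable Skia Image read from the
        // surface's state; nothing is drawn until the first frame arrives.
        // drawImage takes Float coordinates, hence 0f rather than 0.
        image?.let { canvas.nativeCanvas.drawImage(it, 0f, 0f) }
    }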
}
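Note that drawing the image at (0, 0) places it at its native size in the top-left corner of the canvas. If the video should instead be scaled to fit the window while keeping its aspect ratio, the destination rectangle could be computed with a small helper like the one below. `FitRect` and `fitInside` are hypothetical names introduced for this sketch, not part of vlcj, Skia, or Compose.

```kotlin
/** Destination rectangle in canvas pixels. */
data class FitRect(val left: Float, val top: Float, val width: Float, val height: Float)

/**
 * Scales a srcW x srcH image to fit inside a dstW x dstH area,
 * preserving the aspect ratio and centering the result.
 */
fun fitInside(srcW: Int, srcH: Int, dstW: Float, dstH: Float): FitRect {
    require(srcW > 0 && srcH > 0) { "source size must be positive" }
    // The smaller of the two ratios guarantees the image fits on both axes.
    val scale = minOf(dstW / srcW, dstH / srcH)
    val w = srcW * scale
    val h = srcH * scale
    return FitRect((dstW - w) / 2f, (dstH - h) / 2f, w, h)
}
```

For example, a 1920*1080 frame in a 2160*1440 canvas is scaled by 1.125 to 2160*1215 and offset 112.5 pixels from the top. The resulting rectangle could then be passed to a rectangle-based draw call on the native canvas, if the canvas API in use provides one.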