bytedeco / javacv

Java interface to OpenCV, FFmpeg, and more

JavaFXFrameConverter is pretty slow #1989

Open thhart opened 1 year ago

thhart commented 1 year ago

I looked into this briefly and found that JavaFXFrameConverter.convert(Frame), which produces an Image, is rather slow. Fortunately, I found a much faster conversion approach that uses JavaFX's PixelBuffer. I am no expert in this, but it looks like a substantial improvement:

https://github.com/rladstaetter/javacv-webcam/blob/master/src/main/java/net/ladstatt/javacv/fx/WebcamFXController.java

saudet commented 1 year ago

Contributions are welcome! /cc @rladstaetter @johanvos

thhart commented 1 year ago

I analyzed this a bit further and found that the bottleneck is not JavaFX itself. The problem is getting the data from one world to the other. In my case I process rather large images, 3840x2160. The PixelBuffer in JavaFX expects 4 bands, and copying all bands from JavaCV to JavaFX can take up to 30 ms in total. That works out to about 1 GB/s. Unfortunately I don't know the underlying data structures or what to expect here; copying from GPU memory might also be an issue. The PixelBuffer can be backed by any ByteBuffer, so there might be a better way, but that is a bigger project for sure.
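To make the "4 bands" cost concrete, here is a minimal sketch of expanding tightly packed 3-byte BGR pixels into the 4-byte BGRA layout. It uses only `java.nio`, so it does not touch JavaCV or JavaFX. The class name `BgrToBgra`, the assumption that the source is BGR, and the assumption that rows carry no padding are mine and are not taken from this thread or from the linked controller. A per-pixel loop like this is also not the fastest option; it only shows what has to happen to every pixel.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not part of JavaCV or JavaFX): pads BGR pixels to BGRA. */
public final class BgrToBgra {
    private BgrToBgra() {}

    /**
     * Copies width * height tightly packed BGR pixels into a new direct buffer
     * of BGRA pixels with fully opaque alpha.
     * Assumption: no row padding, so the source stride is exactly width * 3.
     */
    public static ByteBuffer expand(ByteBuffer bgr, int width, int height) {
        int pixels = width * height;
        if (bgr.remaining() < pixels * 3) {
            throw new IllegalArgumentException("source buffer too small for " + pixels + " BGR pixels");
        }
        ByteBuffer bgra = ByteBuffer.allocateDirect(pixels * 4);
        int src = bgr.position();              // absolute reads leave the source position untouched
        for (int i = 0; i < pixels; i++) {
            bgra.put(bgr.get(src++));          // blue
            bgra.put(bgr.get(src++));          // green
            bgra.put(bgr.get(src++));          // red
            bgra.put((byte) 0xFF);             // alpha: opaque
        }
        bgra.flip();                           // make the buffer readable from index 0
        return bgra;
    }
}
```

Because alpha is always 255 here, premultiplied and non-premultiplied color values are identical, so the result should be usable with a premultiplied BGRA pixel format as well.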

saudet commented 1 year ago

Right, if you need to display that as fast as possible, and it's already in GPU memory, you'll need to use an API that supports sharing video buffers. JavaFX most likely doesn't support that, and the OpenGL interoperability in CUDA is deprecated, but you can give that a try. It looks like the new way of doing this is with NvSciBuf, which works with Vulkan too: https://developer.nvidia.com/blog/sharing-cuda-resources-through-interoperability-with-nvscibuf-and-nvscisync/

thhart commented 1 year ago

1 GB/s does not sound like much, but I think much more is probably transferred when a Mat structure is handled. I am wondering whether it would be possible to access the raw data directly to limit any overhead.
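For reference, the 1 GB/s figure follows from the numbers given earlier: a 3840x2160 frame with 4 bytes per pixel is 33,177,600 bytes, and moving that in 30 ms is roughly 1.1 GB/s. A small sketch of the arithmetic, with the hypothetical class name `FrameThroughput`:

```java
/** Hypothetical helper: back-of-the-envelope throughput for one frame copy. */
public final class FrameThroughput {
    private FrameThroughput() {}

    /** Size in bytes of one tightly packed frame. */
    static long frameBytes(int width, int height, int channels) {
        return (long) width * height * channels;
    }

    /** Throughput in GB/s (10^9 bytes per second) for copying `bytes` in `millis`. */
    static double gigabytesPerSecond(long bytes, double millis) {
        return bytes / (millis / 1000.0) / 1e9;
    }

    public static void main(String[] args) {
        long bytes = frameBytes(3840, 2160, 4);
        System.out.println(bytes + " bytes per frame");
        System.out.println(gigabytesPerSecond(bytes, 30.0) + " GB/s at 30 ms per frame");
    }
}
```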

saudet commented 1 year ago

Data structures like PixelBuffer and Mat deal with raw data.

thhart commented 1 year ago

I understand that, and in the end I am copying plain bytes between the Mat and the PixelBuffer. However, it takes a really long time for the underlying data of the Mat to become available, and it looks like more is going on than simply copying a byte structure from CUDA.
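One way to check this would be to time a bare bulk copy between two plain direct buffers of the same size and compare it with the time the real pipeline takes. If the bare copy is much faster, the extra time is probably spent making the Mat data available rather than in the copy itself. The sketch below uses only `java.nio`. The class name `CopyBaseline` is hypothetical, and a single `System.nanoTime()` measurement like this is only a rough indication, not a proper benchmark.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: rough baseline for a host-to-host bulk copy. */
public final class CopyBaseline {
    private CopyBaseline() {}

    /**
     * Copies the full capacity of src into dst with one bulk put and returns
     * the elapsed nanoseconds. Duplicates are used so that the callers'
     * positions and limits are not modified.
     */
    static long timeBulkCopyNanos(ByteBuffer src, ByteBuffer dst) {
        ByteBuffer s = src.duplicate();
        ByteBuffer d = dst.duplicate();
        s.clear();                             // position 0, limit = capacity
        d.clear();
        long start = System.nanoTime();
        d.put(s);                              // single bulk transfer
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int size = 3840 * 2160 * 4;            // one 4-band frame, as in the thread
        ByteBuffer src = ByteBuffer.allocateDirect(size);
        ByteBuffer dst = ByteBuffer.allocateDirect(size);
        for (int i = 0; i < 5; i++) {
            timeBulkCopyNanos(src, dst);       // warm-up runs
        }
        long nanos = timeBulkCopyNanos(src, dst);
        System.out.println("bulk copy of " + size + " bytes took " + (nanos / 1e6) + " ms");
    }
}
```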