Improve rendering speed

EugenDueck commented 3 years ago

First of all, many thanks for this useful project! I'm using it to turn a recorded MIDI file into a video (thanks to MIDIVisualizer) with audio (thanks to FluidSynth). I'm doing this for a hobby project, one recording a day, and I have it fully automated now (in a headless Docker container: https://github.com/EugenDueck/SisypheanMusic), so I just have to hit a button and it converts to video and audio, merges them, etc. I'm pretty happy with this setup. The problem is that rendering is so slow it shouldn't be measured in fps but in spf (I'm exaggerating). Rendering a 1-minute video takes about 12 minutes, maxing out 7 of my 8 threads (4 CPU cores, 8 threads), with the remaining thread busy with Firefox and stuff.

The command I am using is something like this:

MIDIVisualizer --midi x.mid --size 1920 1080 --preroll 0 --postroll 3 --export x.mp4 --format MPEG4

I'm running this on a 2-year-old, relatively high-spec Lenovo X1 Carbon "ultra notebook" with an i7-8550U and 16GB RAM, running Ubuntu. Is this the performance I should expect, or is there a way to improve it? I am pretty much clueless when it comes to GPUs (not that I think my notebook has a usable one anyway) and things like OpenGL drivers, so maybe there's room for improvement there...

kosua20 commented 3 years ago

Hello, thank you for using MIDIVisualizer, I'm happy to hear that it has found a place in your workflow!

Regarding performance, exporting usually takes more time than "regular" visualisation because of the video encoding and writing to disk, but on a CPU & GPU older than yours I usually get an 8x slowdown at worst. So a 12x slowdown on a more recent laptop seems weird (unless thermal throttling kicks in really aggressively or the disk I/O is very slow).

I don't know if Docker can expose the GPU to the applications running in the image. I know there are some options to tell Docker which GPU to use, but I'm not sure whether they support Intel GPUs. If Docker is instead emulating the GPU on the CPU, this could explain the slow timings and the high multi-core usage. One way to test this is to run a MIDIVisualizer export locally rather than in Docker, and check whether performance improves.
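A quick way to see whether OpenGL is being emulated on the CPU is to look at the renderer string. The following is only a minimal sketch, assuming a Mesa-based Linux setup with `glxinfo` (from mesa-utils) installed; the `classify_renderer` helper and the list of software renderer names are illustrative assumptions, not part of MIDIVisualizer.

```shell
#!/bin/sh
# Classify an OpenGL renderer string as "software" or "hardware".
# Assumption: Mesa's CPU rasterizers report themselves as llvmpipe,
# softpipe or swrast; any other name is treated as a hardware driver.
classify_renderer() {
    case "$1" in
        *llvmpipe*|*softpipe*|*swrast*) echo "software" ;;
        *) echo "hardware" ;;
    esac
}

# Query the live renderer only when glxinfo exists and a display is set.
if command -v glxinfo >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    renderer=$(glxinfo 2>/dev/null | grep "OpenGL renderer string" || true)
    if [ -n "$renderer" ]; then
        echo "$(classify_renderer "$renderer"): $renderer"
    fi
fi
```

Running this both on the host and inside the container (against whichever X server the container uses) should show whether the two environments end up on different renderers.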

EugenDueck commented 3 years ago

Thanks @kosua20. 8x vs 12x - so it is in fact in the same order of magnitude, which is good to know. And your explanations as to why make sense.

I have an (I would believe) pretty fast SSD. Even so, the total file size after something like 40 minutes of rendering was a mere 532MB - I guess any old HDD could handle this? (I don't know if there are any large temporary writes in the background that I'm not seeing)

> So a 12x slowdown on a more recent laptop seems weird (except if thermal throttling kicks in really aggressively or the disk I/O is very slow).

Yesterday part of the run happened on battery, and perhaps the power settings are not tuned for highest performance in that case. I will also try Docker vs. bare-metal.

> If Docker is instead emulating the GPU on the CPU, this could explain the slow timings and the high multi-core usage.

If it is written to be as efficient as possible, I guess it would max out all available computing units. I'm not sure, though, whether using a hardware GPU means (almost) not using the CPU, i.e. whether it is an either-CPU-or-GPU thing.

kosua20 commented 3 years ago

Hello again, I've run a few tests, comparing a bare-metal export and an export using the Docker image you are providing, for the same MIDI file, on the same computer.

Local: Export took 36.499s
Docker: Export took 418.362s

During the export in Docker the GPU was completely unused and CPU usage was very high, while during the local export GPU usage was high and CPU usage was lower. Maybe there is a way to expose the GPU to Docker, depending on your platform (on a Unix host, it might be possible to pass the X device through to the Docker image, for instance, while on Windows and macOS it might be more complicated). Another option would be to tweak the X server options to maximize rendering speed, but I don't know if such options exist.

EugenDueck commented 3 years ago

After you beat me to it (thanks!) - I finally got around to trying it myself.

local X server: Export took 19.551s.
docker dummy X server: Export took 57.467s.

So for me it's 3x slower on Docker, whereas for you it's 11x slower. Wow! That either means your GPU is much faster relative to your CPU than mine is, or your test case was different.
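The two ratios can be recomputed from the four "Export took" timings quoted above. A small sketch, where the `slowdown` function name is made up for illustration:

```shell
#!/bin/sh
# Print how many times slower the Docker export was than the local one.
# Arguments: Docker export time, local export time (both in seconds).
slowdown() {
    LC_ALL=C awk -v docker="$1" -v localrun="$2" \
        'BEGIN { printf "%.2f\n", docker / localrun }'
}

slowdown 418.362 36.499   # kosua20's timings
slowdown 57.467 19.551    # EugenDueck's timings
```

This gives roughly 11.5x for the first machine and 2.9x for the second.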

For the record:

CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (400MHz - 4GHz)
GPU: Intel UHD Graphics 620

This was the cmd line I used:

MIDIVisualizer --midi x.mid --size 1920 1080 --preroll 0 --postroll 3 --export x.mp4 --format MPEG4 --color-keyboard 0 1 1 --color-particles 1 0 0 --color-pedal 0 1 0 --color-major 0 0 1 --color-minor 1 0 0

In the meantime I've installed intel-gpu-tools to be able to see what's going on GPU-wise. This is what intel_gpu_top shows while rendering on my local X server:

intel-gpu-top -  328/ 328 MHz;   39% RC6;  0.97 Watts;      232 irqs/s

      IMC reads:     2183 MiB/s
     IMC writes:     1029 MiB/s

          ENGINE      BUSY                                                                                                                                                                          MI_SEMA MI_WAIT
     Render/3D/0   57.53% |████████████████████████████████████████████████████████████████████████████████████████████████                                                                       |      0%      0%
       Blitter/0    0.00% |                                                                                                                                                                       |      0%      0%
         Video/0    0.00% |                                                                                                                                                                       |      0%      0%
  VideoEnhance/0    0.00% |                                                                                                                                                                       |      0%      0%

The difference in GPU / CPU usage between local and Docker was similar to yours.

The next thing I'll check is whether I can use the GPU from Docker, either by exporting the X server or, ideally, by somehow using a headless X server that makes use of the GPU. Thanks for the nudge. I will close this issue now, as this is not a MIDIVisualizer problem, but I will add another comment here once I've figured out how to use the GPU inside Docker for MIDIVisualizer, if that is possible.

EugenDueck commented 3 years ago

So using the Docker host's X server from inside the container works; alas, in that case the GPU is still not being used. That makes me wonder whether the GPU is used by the X server at all. Perhaps it is used by the X client, in which case I can probably continue using the dummy X server.

So I tried docker run --gpus all ..., but got an error:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

The reason is probably that Docker's --gpus option seems to support only NVIDIA GPUs, while mine is an Intel GPU.
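One avenue that has not been tried in this thread is to hand the host's DRM devices to the container with Docker's generic `--device` flag instead of `--gpus`. The sketch below only checks that a render node exists and prints the flag; the `dri_device_flag` helper and the `midivisualizer-image` name are hypothetical, and whether MIDIVisualizer then actually renders on the GPU also depends on the container having matching Mesa drivers and an X server (or other GL context) that uses the device.

```shell
#!/bin/sh
# Print the docker flag that would expose the host's DRM devices, but only
# if a render node (renderD*) actually exists under the given directory.
# Defaults to /dev/dri, the usual location on Linux.
dri_device_flag() {
    dir=${1:-/dev/dri}
    for node in "$dir"/renderD*; do
        if [ -e "$node" ]; then
            echo "--device $dir:$dir"
            return 0
        fi
    done
    return 1
}

# Hypothetical usage; "midivisualizer-image" is a placeholder image name:
#   docker run $(dri_device_flag) midivisualizer-image
```

If the function prints nothing, the host has no render node under that directory and this approach would not apply.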

kosua20 / MIDIVisualizer

Improve rendering speed #99