Fredrum opened this issue 2 years ago
The IMU + vision fusion is costly. I'd like to find a better way to do that. In the meantime, make sure you're configuring with `meson -Dbuildtype=release`.
You can also try adding CFLAGS, e.g. `CFLAGS='-ffast-math -march=native' meson -Dbuildtype=release`
Ok! So by 'IMU + vision' do you mean the entire Kalman-filter process? Or are you aware of specific functions that take much more time than others?
I noticed a lot of matrix math calls. What do you think about looking at the Eigen library? It's meant to be fast and might even have NEON intrinsics for the RPi's CPU.
The compiler vectorises the Kalman matrix functions quite well when it's allowed to (`-march=native`). Eigen might do it too, but it's a lot of work with ugly C++ templating just to find out.
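For reference, here's a minimal sketch (not OpenHMD's actual code) of the kind of small fixed-size matrix routine that GCC/Clang will typically auto-vectorise when built with `-O3 -march=native` (optionally `-ffast-math`):

```c
/* Minimal illustrative sketch, not OpenHMD code: a small dense matrix
 * multiply the compiler can auto-vectorise when built with
 * -O3 -march=native (add -ffast-math to relax FP ordering). */
#include <stdio.h>

#define N 6  /* small fixed size, for illustration only */

static void mat_mul(const double a[N][N], const double b[N][N], double out[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];  /* simple inner loop the compiler can vectorise */
            out[i][j] = sum;
        }
}

int main(void)
{
    double a[N][N] = {{0}}, b[N][N] = {{0}}, c[N][N];
    for (int i = 0; i < N; i++) { a[i][i] = 2.0; b[i][i] = 3.0; }
    mat_mul(a, b, c);
    printf("c[0][0] = %f\n", c[0][0]);  /* expect 6.0 */
    return 0;
}
```

Checking the generated assembly (e.g. with `gcc -O3 -march=native -S`) for NEON/SSE instructions is a cheap way to confirm the compiler is already doing this work before pulling in Eigen.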
Ok, I won't pursue that then. :) I saw in your previous issue about this that you mentioned the tracking camera image acquisition and maybe initial tracking-marker identification? Maybe that's an area that would be worth looking at?
I guess I'd just have to time some parts to find out. No worries, this is just a small test project, so please don't spend any time on my behalf!! :)
The search for reacquiring LED identities can be expensive. There is probably a better way to do it, but I haven't thought of one yet. It's not highly vectorisable in the current form. Parallelising across other threads might spread the CPU usage, but each camera is already running in a separate thread.
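To illustrate where that cost comes from, here's a toy brute-force blob-to-LED assignment under a single candidate pose. The types and function are hypothetical, not the project's real search, which also has to hypothesise candidate poses and is therefore considerably more expensive:

```c
/* Hypothetical sketch, not the project's real search: brute-force
 * blob-to-LED assignment. Every detected blob is tested against every
 * model LED projected under one candidate pose, so the cost is
 * O(n_blobs * n_leds) per pose hypothesis, and the branchy per-pair
 * logic does not auto-vectorise well. */
#include <stdio.h>

typedef struct { double x, y; } blob_t;       /* 2D blob centre in the image    */
typedef struct { double u, v; } led_proj_t;   /* model LED projected into image */

static int match_blobs_to_leds(const blob_t *blobs, int n_blobs,
                               const led_proj_t *leds, int n_leds,
                               int *assignment, double max_dist)
{
    int matched = 0;
    for (int b = 0; b < n_blobs; b++) {
        int best = -1;
        double best_d2 = max_dist * max_dist;
        for (int l = 0; l < n_leds; l++) {
            double dx = blobs[b].x - leds[l].u;
            double dy = blobs[b].y - leds[l].v;
            double d2 = dx * dx + dy * dy;
            if (d2 < best_d2) { best_d2 = d2; best = l; }
        }
        assignment[b] = best;
        if (best >= 0)
            matched++;
    }
    return matched;
}

int main(void)
{
    blob_t blobs[2] = { { 100, 100 }, { 200, 150 } };
    led_proj_t leds[3] = { { 101, 99 }, { 250, 10 }, { 198, 152 } };
    int assignment[2];
    int n = match_blobs_to_leds(blobs, 2, leds, 3, assignment, 10.0);
    printf("matched %d blobs: [%d, %d]\n", n, assignment[0], assignment[1]);
    return 0;
}
```

The data-dependent branching in the inner loop is a big part of what resists vectorisation.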
One option for reducing the cost of the Kalman filter would be to do some averaging of the incoming IMU samples. Averaging every 2 samples would still yield a 500Hz update rate, but would nearly halve the cost of the filtering for the HMD. Similarly for the controllers, which have a native 500Hz IMU rate.
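A minimal sketch of that decimation idea, using hypothetical types rather than the actual OpenHMD structures or API:

```c
/* Minimal sketch of the suggested sample averaging, with hypothetical
 * types -- not the actual OpenHMD structures. Pairs of raw IMU samples
 * are averaged so the filter is only stepped at half the native rate
 * (e.g. 1000Hz -> 500Hz for the HMD). */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    double accel[3];   /* m/s^2   */
    double gyro[3];    /* rad/s   */
    double timestamp;  /* seconds */
} imu_sample_t;

typedef struct {
    imu_sample_t pending;
    bool have_pending;
} imu_decimator_t;

/* Returns true when 'out' holds an averaged sample ready for the filter. */
static bool imu_decimate(imu_decimator_t *d, const imu_sample_t *in,
                         imu_sample_t *out)
{
    if (!d->have_pending) {
        d->pending = *in;
        d->have_pending = true;
        return false;
    }
    for (int i = 0; i < 3; i++) {
        out->accel[i] = 0.5 * (d->pending.accel[i] + in->accel[i]);
        out->gyro[i]  = 0.5 * (d->pending.gyro[i]  + in->gyro[i]);
    }
    /* Use the midpoint time so the filter's dt stays consistent. */
    out->timestamp = 0.5 * (d->pending.timestamp + in->timestamp);
    d->have_pending = false;
    return true;
}

int main(void)
{
    imu_decimator_t d = { .have_pending = false };
    imu_sample_t s1 = { { 0, 0, 9.7 }, { 0.01, 0, 0 }, 0.000 };
    imu_sample_t s2 = { { 0, 0, 9.9 }, { 0.03, 0, 0 }, 0.001 };
    imu_sample_t avg;

    imu_decimate(&d, &s1, &avg);      /* buffers the first sample       */
    if (imu_decimate(&d, &s2, &avg))  /* second sample -> averaged pair */
        printf("accel z = %f, gyro x = %f, t = %f\n",
               avg.accel[2], avg.gyro[0], avg.timestamp);
    return 0;
}
```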
On the topic, would any section of this project benefit from a lot of parallelization on the GPU? I haven't really read the source code yet, so I don't know if it already does that where available (I imagine it'd be preferred to keep CPU fallbacks available). I've been learning compute shaders (Rust/WGPU/WGSL, but I can adapt to whatever would be preferred - and I'd like to gain perspective from other tools) and I'd love to help this project.
OpenHMD doesn't have any GPU acceleration. The tracking is all done in CPU code.
I could imagine some of the matrix operations would be good for a GPU, but there are likely bottlenecks transferring the data to and from the GPU that make it not worthwhile in practice. Another one of those things where you'd have to spend time just to find out whether or not it works at all.
For me, it's more important to focus on improving the quality of the tracking than on optimising.
Thank you, there's a bunch of ideas for me there. I'll see if I get around to trying anything out. I'm just using my Rift as a stand-in for some imagined AR glasses, so that I have some HMD to use while waiting for the future to arrive with some cheap AR. Ha, probably not before we have flying cars.
Also, if you are spending time on this project, maybe some inside-out SLAM tracking could be interesting. I was thinking maybe I could use Linux or Android on an old Quest, but I haven't seen anyone doing that natively on the hardware. But I digress.
Thanks for all your work and these suggestions!! Cheers
I haven't touched this project for a few months, but I'll be back to it soon. I've been busy porting the Rift S driver to Monado, and have SLAM tracking (and some hand tracking) working there, nearly ready for merging to main.
@thaytan That's awesome! Cool (for us) that you're using all you've learned doing VR tracking. Maybe I'll have to look for a cheap Rift S on eBay now :)
@clay53 I did have a play with doing parts of a SLAM algorithm on the GPU by copying from this awesome project that does things like this in GLSL: https://github.com/alemart/speedy-vision
I didn't get that far, as things got a bit hard and I'm not a good enough scientist/programmer, but also in my case I need a fair bit of GPU power left over for the graphics to be displayed in the HMD. Something else to consider: maybe an ultimate solution would be able to shift some steps between GPU and CPU depending on where a particular project has more available resources. All projects are going to be different. My considerations are mainly low-powered devices like phones or the RPi 5 (soon?).
Hello!
I am going to use OpenHMD for rotational Rift CV1 tracking in a small project, but I would also love to have positional tracking.
I've managed to get it running on the Rift + RPi before, but it took around 150% CPU (out of 4x100% for the cores) and I got some stuttery results.
Does anyone know the current algorithm and code well enough to advise what I should focus on if I wanted to attempt to speed this up?
Or maybe even just splitting it into more threads, since it may have been a saturated core that caused the stutter. I have no idea.
Any ideas welcome! Cheers