Fredrum opened this issue 2 years ago
The IMU + vision fusion is costly. I'd like to find a better way to do that. In the meantime, make sure you're configuring with `meson -Dbuildtype=release`.
You can also try adding CFLAGS, e.g. `CFLAGS='-ffast-math -march=native' meson -Dbuildtype=release`
Ok! So by 'IMU + vision' do you mean the entire Kalman-filter process? Or are you aware of specific functions that take much more time than others?
I noticed a lot of matrix math calls. What do you think about looking at the Eigen library? It's meant to be fast and might even have NEON intrinsics for the RPi's CPU.
The compiler vectorises the Kalman matrix functions quite well when it's allowed to (`-march=native`). Eigen might do it too, but it's a lot of work with ugly C++ templating just to find out.
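For reference, here's a minimal sketch (not OpenHMD's actual code) of the kind of small fixed-size matrix routine that GCC/Clang will typically auto-vectorise when built with `-O3 -march=native` (optionally `-ffast-math`):

```c
/* Minimal illustrative sketch, not OpenHMD code: a small dense matrix
 * multiply the compiler can auto-vectorise when built with
 * -O3 -march=native (add -ffast-math to relax FP ordering). */
#include <stdio.h>

#define N 6  /* small fixed size, for illustration only */

static void mat_mul(const double a[N][N], const double b[N][N], double out[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];  /* simple inner loop the compiler can vectorise */
            out[i][j] = sum;
        }
}

int main(void)
{
    double a[N][N] = {{0}}, b[N][N] = {{0}}, c[N][N];
    for (int i = 0; i < N; i++) { a[i][i] = 2.0; b[i][i] = 3.0; }
    mat_mul(a, b, c);
    printf("c[0][0] = %f\n", c[0][0]);  /* expect 6.0 */
    return 0;
}
```

Checking the generated assembly (e.g. with `gcc -O3 -march=native -S`) for NEON/SSE instructions is a cheap way to confirm the compiler is already doing this work before pulling in Eigen.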
Ok, I won't pursue that then. :) I saw in your previous issue about this that you mentioned the tracking camera image acquisition and maybe initial tracking-marker identification? Maybe that's an area that would be worth looking at?
I guess I'd just have to time some parts to find out. No worries, this is just a small test project, so please don't spend any time on my behalf!! :)
The search for reacquiring LED identities can be expensive. There is probably a better way to do it, but I haven't thought of one yet. It's not highly vectorisable in the current form. Parallelising across other threads might spread the CPU usage, but each camera is already running in a separate thread.
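To illustrate where that cost comes from, here's a toy brute-force blob-to-LED assignment under a single candidate pose. The types and function are hypothetical, not the project's real search, which also has to hypothesise candidate poses and is therefore considerably more expensive:

```c
/* Hypothetical sketch, not the project's real search: brute-force
 * blob-to-LED assignment. Every detected blob is tested against every
 * model LED projected under one candidate pose, so the cost is
 * O(n_blobs * n_leds) per pose hypothesis, and the branchy per-pair
 * logic does not auto-vectorise well. */
#include <stdio.h>

typedef struct { double x, y; } blob_t;       /* 2D blob centre in the image    */
typedef struct { double u, v; } led_proj_t;   /* model LED projected into image */

static int match_blobs_to_leds(const blob_t *blobs, int n_blobs,
                               const led_proj_t *leds, int n_leds,
                               int *assignment, double max_dist)
{
    int matched = 0;
    for (int b = 0; b < n_blobs; b++) {
        int best = -1;
        double best_d2 = max_dist * max_dist;
        for (int l = 0; l < n_leds; l++) {
            double dx = blobs[b].x - leds[l].u;
            double dy = blobs[b].y - leds[l].v;
            double d2 = dx * dx + dy * dy;
            if (d2 < best_d2) { best_d2 = d2; best = l; }
        }
        assignment[b] = best;
        if (best >= 0)
            matched++;
    }
    return matched;
}

int main(void)
{
    blob_t blobs[2] = { { 100, 100 }, { 200, 150 } };
    led_proj_t leds[3] = { { 101, 99 }, { 250, 10 }, { 198, 152 } };
    int assignment[2];
    int n = match_blobs_to_leds(blobs, 2, leds, 3, assignment, 10.0);
    printf("matched %d blobs: [%d, %d]\n", n, assignment[0], assignment[1]);
    return 0;
}
```

The data-dependent branching in the inner loop is a big part of what resists vectorisation.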
One option for reducing the cost of the Kalman filter would be to do some averaging of the incoming IMU samples. Averaging every 2 samples would still yield a 500Hz update rate, but would nearly halve the cost of the filtering for the HMD. Similarly for the controllers, which have a native 500Hz IMU rate.
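A minimal sketch of that decimation idea, using hypothetical types rather than the actual OpenHMD structures or API:

```c
/* Minimal sketch of the suggested sample averaging, with hypothetical
 * types -- not the actual OpenHMD structures. Pairs of raw IMU samples
 * are averaged so the filter is only stepped at half the native rate
 * (e.g. 1000Hz -> 500Hz for the HMD). */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    double accel[3];   /* m/s^2   */
    double gyro[3];    /* rad/s   */
    double timestamp;  /* seconds */
} imu_sample_t;

typedef struct {
    imu_sample_t pending;
    bool have_pending;
} imu_decimator_t;

/* Returns true when 'out' holds an averaged sample ready for the filter. */
static bool imu_decimate(imu_decimator_t *d, const imu_sample_t *in,
                         imu_sample_t *out)
{
    if (!d->have_pending) {
        d->pending = *in;
        d->have_pending = true;
        return false;
    }
    for (int i = 0; i < 3; i++) {
        out->accel[i] = 0.5 * (d->pending.accel[i] + in->accel[i]);
        out->gyro[i]  = 0.5 * (d->pending.gyro[i]  + in->gyro[i]);
    }
    /* Use the midpoint time so the filter's dt stays consistent. */
    out->timestamp = 0.5 * (d->pending.timestamp + in->timestamp);
    d->have_pending = false;
    return true;
}

int main(void)
{
    imu_decimator_t d = { .have_pending = false };
    imu_sample_t s1 = { { 0, 0, 9.7 }, { 0.01, 0, 0 }, 0.000 };
    imu_sample_t s2 = { { 0, 0, 9.9 }, { 0.03, 0, 0 }, 0.001 };
    imu_sample_t avg;

    imu_decimate(&d, &s1, &avg);      /* buffers the first sample       */
    if (imu_decimate(&d, &s2, &avg))  /* second sample -> averaged pair */
        printf("accel z = %f, gyro x = %f, t = %f\n",
               avg.accel[2], avg.gyro[0], avg.timestamp);
    return 0;
}
```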
On the topic, would any section of this project benefit from a lot of parallelization on the GPU? I haven't really read the source code yet, so I don't know if it already does that where available (I imagine it'd be preferred to keep CPU fallbacks available). I've been learning compute shaders (Rust/WGPU/WGSL, but I can adapt to whatever would be preferred - and I'd like to gain perspective from other tools) and I'd love to help this project.
OpenHMD doesn't have any GPU acceleration. The tracking is all done in CPU code.
I could imagine some of the matrix operations would be good for a GPU, but there are likely bottlenecks transferring the data to and from the GPU that make it not worthwhile in practice. Another one of those things where you'd have to spend time just to find out whether or not it works at all.
For me, it's more important to focus on improving the quality of the tracking than on optimising.
Thank you, there's a bunch of ideas for me there. I'll see if I get around to trying anything out. I'm just using my Rift as a stand-in for some imagined AR glasses, so that I have some HMD to use while waiting for the future to arrive with some cheap AR. Ha, probably not before we have flying cars.
Also, if you are spending time on this project, maybe some inside-out SLAM tracking could be interesting. I was thinking maybe I could use Linux or Android on an old Quest, but I haven't seen anyone doing that natively on the hardware. But I digress.
Thanks for all your work and these suggestions!! Cheers
I haven't touched this project for a few months, but I'll be back to it soon. I've been busy porting the Rift S driver to Monado, and have SLAM tracking (and some hand tracking) working there, nearly ready for merging to main.
@thaytan That's awesome! Cool (for us) that you're using all you've learned doing VR tracking. Maybe I'll have to look for a cheap Rift S on eBay now :)
@clay53 I did have a play with doing parts of a SLAM algorithm on the GPU by copying from this awesome project that does things like this in GLSL: https://github.com/alemart/speedy-vision
I didn't get that far, as things got a bit hard and I'm not a good enough scientist/programmer, but also in my case I need a fair bit of GPU power left over for the graphics to be displayed in the HMD. Something else to consider: maybe an ultimate solution would be able to shift some steps between GPU and CPU depending on where a particular project has more available resources. All projects are going to be different. My considerations are mainly low-powered devices like phones or the RPi 5 (soon?).
Hello!
I am going to use OpenHMD for rotational Rift CV1 tracking in a small project, but I would also love to have positional tracking.
I've managed to get it running on the Rift + RPi before, but it took around 150% CPU (out of 4x100% for the cores) and I got some stuttery results.
Does anyone know the current algorithm and code well enough to advise what I should focus on if I wanted to attempt to speed this up?
Or maybe even just splitting it into more threads, since it may have been a saturated core that caused the stutter. I have no idea.
Any ideas welcome! Cheers