michalpolic / hololens_mapper

The mapping pipeline for HoloLens device

Ways/Ideas to make the localization faster #11

Closed rakshith95 closed 2 years ago

rakshith95 commented 2 years ago

Hello @michalpolic, I'm running a mapping pipeline and then a localization pipeline on the ARI robot, and I was able to reduce the mapping time significantly by switching the matching method to 'sequential matching'.

However, the localization pipeline (HlocQueryComposer + HlocLocalizer) takes around 27-30 s for a single query image, and almost twice that the first time it is run. Do you have any ideas I could implement to reduce this?
I could reduce the number of image matches from the default 50 to something lower, but I'm not sure how this would affect the accuracy.

michalpolic commented 2 years ago

Hello @rakshith95, thank you for the question. 1) Regarding the mapping, there is a way to speed it up even more. The library provides a node called "MatchingPairSelector" that selects individual pairs of images to match (based on the SLAM output converted to COLMAP format by IOConvertor), instead of matching each image against a sequence of "k" neighbors. Please see the example here:

[screenshot: mapping pipeline graph]
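
For intuition, here is a minimal sketch of the kind of geometric pair selection such a node can perform once SLAM poses are available. The function name, pose format, and thresholds below are illustrative assumptions, not the MatchingPairSelector API:

```python
# Illustrative sketch only: pick image pairs to match from known SLAM poses
# instead of matching every sequential window of k images.
import numpy as np

def select_pairs(centers, viewdirs, max_dist=2.0, max_angle_deg=45.0):
    """Return index pairs (i, j) of images whose camera centers are close
    and whose unit viewing directions are similar enough to share features."""
    cos_thr = np.cos(np.deg2rad(max_angle_deg))
    pairs = []
    n = len(centers)
    for i in range(n):
        for j in range(i + 1, n):
            close = np.linalg.norm(centers[i] - centers[j]) < max_dist
            similar = np.dot(viewdirs[i], viewdirs[j]) > cos_thr
            if close and similar:
                pairs.append((i, j))
    return pairs
```

Only the selected pairs are then passed to the feature matcher, which is what saves time over matching the full sequence.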

2) The speed of the HlocLocalizer mainly depends on the processing unit, i.e., CPU or GPU. If SuperGlue runs on the GPU, the number of matched image pairs per second is approx. 40 (using an RTX 3090). If I run the same evaluation on a CPU (I'm not sure of the exact model, but one of the latest), the speed is approx. 2 pairs per second. The matching of keypoints takes most of the localization time; the rest of the computation is much faster. Thus, comparing a single query image against 50 images (obtained by image retrieval) should take about 1-2 s per localization.
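
A quick sanity check with those throughput figures shows why the processing unit dominates (the numbers below are just the ones quoted above):

```python
# Back-of-envelope timing: one query matched against 50 retrieved images.
pairs_per_query = 50
gpu_pairs_per_sec = 40   # SuperGlue on an RTX 3090, as quoted above
cpu_pairs_per_sec = 2    # recent CPU, as quoted above

print(f"GPU: ~{pairs_per_query / gpu_pairs_per_sec:.1f} s per query")  # ~1.2 s
print(f"CPU: ~{pairs_per_query / cpu_pairs_per_sec:.0f} s per query")  # ~25 s
```

The ~25 s CPU estimate is close to the 27-30 s you observe, which hints that the matching may in fact be running on the CPU in your setup.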

There is the possibility to match a smaller number of image pairs (retrieved by image retrieval - NetVLAD), which, as you suspect, decreases the accuracy. We ran some experiments on this accuracy trade-off in this thesis: https://dspace.cvut.cz/handle/10467/94982; however, it is in Czech and not well written. The main message was that you can compensate for the decrease in accuracy by increasing the number of images in the database; the larger database slows the localization only negligibly.
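
If you do want to try fewer retrieval pairs, hloc's standalone API exposes this directly. A hedged sketch (the descriptor paths are assumptions, and how hololens_mapper's HlocLocalizer forwards this parameter may differ):

```python
# Sketch: generate retrieval pairs with a reduced num_matched in hloc.
from pathlib import Path
from hloc import pairs_from_retrieval

pairs_from_retrieval.main(
    descriptors=Path("outputs/global-feats-netvlad.h5"),  # NetVLAD descriptors (assumed path)
    output=Path("outputs/pairs-query-netvlad20.txt"),
    num_matched=20,  # down from the default 50 discussed above
)
```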

rakshith95 commented 2 years ago

@michalpolic Thank you for the response!

The MatchingPairSelector sounds like a good option to incorporate into my current pipeline, thank you for the suggestion.

As to the point of CPU vs GPU, I assumed it was running on the GPU as the default, if available, but I could be wrong. Is there any option/flag to ensure that the computation is done on the GPU?

> The main message was that you can compensate for the decrease in accuracy by increasing the number of images in the database; the larger database slows the localization only negligibly.

This is interesting. I guess I'll have to run more tests on this.

Thank You!

michalpolic commented 2 years ago

It runs on the GPU by default. However, all the drivers, libraries, and compilers need to be compatible. For example, the RTX 3090 requires CUDA 11.3, which in turn requires driver 470.xx. If the Singularity container has the correct CUDA version but the host driver is too old or too new, the evaluation will fall back to the CPU. The processing unit actually used is reported to the terminal at runtime.
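
One quick way to verify this yourself (assuming the matcher runs on PyTorch, as hloc's SuperGlue does):

```python
# Check whether CUDA is actually usable inside the container.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("CUDA runtime:", torch.version.cuda)
```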

rakshith95 commented 2 years ago

Got it, thank you! The 'host' in my case is a Docker container based on an nvidia/cuda image, so I don't think that should be an issue. I will check the logs for it.