ghostFaceKillah / vslam

[2023] Make robot see 👀 map & localize from vision
MIT License
0 stars 1 forks source link

VSLAM

Intro

Here's an easy-to-understand Visual Simultaneous Localization And Mapping (VSLAM) algorithm.

render

I made a whole youtube video series to explain it.

If you want to quickly get to the meat of the code, go to vslam/frontend.py and read Frontend.track() function - that's what gets called on every iteration to resolve pose from image input.

It works on top of data coming from kinda-easy-to-understand triangle-based scene rendering from scratch.

The VSLAM part of this repo is largely reinterpretation of tutorials presented in an excellent book "Introduction to Visual SLAM: From Theory to Practice". See the associated github repo provided by the authors of the book. They also generously provided pdf of the book itself in a realted repo. This is very awesome, and I am grateful for that.

Main entry point into VSLAM demo is:

pipenv shell
python -m lessons.ex_03_full_frontend

This live-renders the environment. That's a bit too slow to feel real-time (at around 1 fps). It's better to first pre-generate the data by python -m sim.run and the run it from saved data.

Structure

Installing

I recommend using pipenv.

python -m pip install pipenv
pipenv --python 3.10   # this assumes that you have python 3.10 installed
pipenv shell
pip install -r requirements.txt

To briefly summarize, we depend mostly on attrs, numpy, jax (!), opencv-python, pandas and scipy. jax is not critical and could be done away with in favour of numpy, but it keeps things fast.

Rendering

I made a small triangle rendering library to make data for VSLAM.

This way we fully control the data coming into the algorithm and we get to learn the camera equations from the "inverse problem" side. Indeed, rendering is inverse of SLAM in a way. Below command runs the entry point of the interactive rendering code.

pipenv shell
python -m sim.render

Sorry it's too slow to feel smooth to humans! It seems we would need to rewrite in c++ to be proper fast.

Jax doesn't like modifying arrays in place.

render

Here's an older toy render of a cube.

render