ElHacker / PTAM

Parallel Tracking and Mapping (PTAM) is a camera tracking system for Augmented Reality. It requires no markers, pre-made maps, known templates, or inertial sensors.

Framework for an Augmented Reality Application

This work was implemented as a project for the Stanford class CS231A: Computer Vision, From 3D Reconstruction to Recognition, taken in Winter 2018. In this work we attempt to implement a small-scale, real-time detection, mapping, and tracking framework. We take a real-time video feed as input. On the first frame we detect keypoints and compute descriptors for them. Using keypoint matching we track these points in the subsequent frames, and new points are added to the map as they are detected. Such tracking and mapping is useful for augmented reality applications. We also show basic image augmentation with a virtual object. More details can be found in the final project report.
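To make the per-frame tracking loop concrete, here is a minimal sketch using OpenCV's built-in ORB detector and a brute-force Hamming matcher; the project's own Python implementation of ORB and matching may differ in the details.

```python
import cv2

orb = cv2.ORB_create(nfeatures=500)                          # keypoint detector + descriptor
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)   # binary-descriptor matcher

cap = cv2.VideoCapture(0)                                    # live web-camera feed
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
prev_kp, prev_des = orb.detectAndCompute(prev_gray, None)    # keypoints on the first frame

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    kp, des = orb.detectAndCompute(gray, None)

    # Track points by matching descriptors against the previous frame.
    matches = sorted(matcher.match(prev_des, des), key=lambda m: m.distance)
    vis = cv2.drawMatches(prev, prev_kp, frame, kp, matches[:50], None)
    cv2.imshow('tracking', vis)

    prev, prev_kp, prev_des = frame, kp, des                 # newly detected points carry forward
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```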

Demo video

Tracking and mapping keypoints across frames

Augmented Video Demo

Code File Structure

Takes care of camera calibration, image keypoint detection and matching, OpenGL rendering of the processed image as a background texture, and OBJ loading and rendering on top of the image.
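For the camera-calibration piece, a minimal OpenCV sketch using a chessboard target could look like the following; the board size, image folder, and use of cv2.calibrateCamera here are assumptions, not necessarily what this repo ships.

```python
import glob

import cv2
import numpy as np

BOARD = (9, 6)  # inner-corner count of the chessboard (columns, rows) -- an assumption

# 3D coordinates of the board corners in the board's own plane (z = 0).
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob('./calib/*.jpg'):                      # calibration shots -- folder is an assumption
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K (intrinsics) and dist (lens distortion) are what the rest of the pipeline needs
# for pose estimation and rendering.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print('camera matrix:\n', K)
```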

Reads images from the camera in ARCameraFragment.java and then passes them to the ARCameraImageProcessor interface, which is implemented in Python with the help of Pyjnius.

Offers an implementation of ARCameraImageProcessor which receives each frame and transforms it into a NumPy array. This file also takes care of the Android app's single-view lifecycle.
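A minimal sketch of what that Python-side implementation might look like with Pyjnius is shown below; the interface's package name, callback name, JNI signature, and the assumed frame layout (a grayscale Y plane) are all illustrative assumptions, not taken from the repo.

```python
import numpy as np
from jnius import PythonJavaClass, java_method


class ImageProcessor(PythonJavaClass):
    # Fully qualified interface name is an assumption for illustration.
    __javainterfaces__ = ['org/elhacker/ptam/ARCameraImageProcessor']
    __javacontext__ = 'app'

    # Assumed callback: processFrame(byte[] data, int width, int height).
    @java_method('([BII)V')
    def processFrame(self, data, width, height):
        # Pyjnius may hand over the Java byte[] as bytes or as a list of signed ints.
        buf = data if isinstance(data, (bytes, bytearray)) else bytearray(b & 0xFF for b in data)
        frame = np.frombuffer(buf, dtype=np.uint8).reshape((height, width))
        # ... hand the NumPy frame to the keypoint detection / matching pipeline ...
```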

It supports live video from a web camera and does the full processing there. To run it, use the command:

python render_model.py --obj CartoonRocket.obj

If you want to store the output as a 10-second video file, append the --video flag:

python render_model.py --obj CartoonRocket.obj --video
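The flag names above come from the commands themselves, but the repo's actual argument handling and recording code are not shown here; the following is one hedged way such a script could wire them up with argparse and OpenCV's VideoWriter.

```python
import argparse

import cv2

parser = argparse.ArgumentParser()
parser.add_argument('--obj', required=True, help='OBJ model to render onto the scene')
parser.add_argument('--video', action='store_true', help='record ~10 s of output to a file')
args = parser.parse_args()

cap = cv2.VideoCapture(0)                                    # live web-camera feed
writer = None
if args.video:
    fps = 30                                                 # assumed frame rate
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter('output.avi', cv2.VideoWriter_fourcc(*'XVID'), fps, (w, h))
    frames_left = fps * 10                                   # roughly 10 seconds of video

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # ... detect/track keypoints and render args.obj onto `frame` here ...
    if writer is not None:
        writer.write(frame)
        frames_left -= 1
        if frames_left == 0:
            break

cap.release()
if writer is not None:
    writer.release()
```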

Python implementation of keypoint detection and matching

Some relevant files for feature extraction and keypoint detection are:

The ORB.ipynb IPython notebook can be used to follow the step-by-step output of the ORB keypoint feature detection and matching implementation. ORB stands for Oriented FAST and Rotated BRIEF. The notebook reads the images stored in the ./data folder and generates three transformed corresponding images by applying normalization (1st image), rotation (2nd image), and an affine transformation followed by warping (3rd image). The original image and any of the three generated images are then used to test keypoint detection and matching. This notebook uses the following:
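To make those three transformations concrete, here is a small OpenCV sketch that produces them; the input path and the specific rotation/affine parameters are assumptions, not the values used in ORB.ipynb.

```python
import cv2
import numpy as np

img = cv2.imread('./data/sample.jpg', cv2.IMREAD_GRAYSCALE)  # image name is an assumption
h, w = img.shape

# 1st image: intensity normalization to the full 0-255 range.
normalized = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)

# 2nd image: rotation about the image center.
R = cv2.getRotationMatrix2D((w / 2, h / 2), 30, 1.0)
rotated = cv2.warpAffine(img, R, (w, h))

# 3rd image: affine transformation followed by warping.
src = np.float32([[0, 0], [w - 1, 0], [0, h - 1]])
dst = np.float32([[0, h * 0.1], [w * 0.9, 0], [w * 0.1, h * 0.9]])
A = cv2.getAffineTransform(src, dst)
warped = cv2.warpAffine(img, A, (w, h))
```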

Running the pipeline for keypoint detection and matching for images

Running the pipeline for keypoint detection and matching across frames of a video