terzakig / gptam

PTAM with OpenCV: This is a deep code modification of the original PTAM by Klein and Murray to work using only OpenCV. Although the initial intention was just to adapt the libraries, I ended up having to modify the code and some algorithms as well.

PTAM with OpenCV (gptam)

This real-time Visual SLAM application is a deep code modification of the brilliant original work by Klein and Murray, "Parallel Tracking and Mapping" (PTAM).

Brief Overview of the Additions / Modifications

Although this certainly looks like PTAM, it is a major line-by-line hack and, despite many similarities (mainly in the interface), it differs from the original code in many ways. In fact, what is mostly inherited from the original is the OpenGL-based interface. The vision algorithms have been rewritten, either on the basis of the original or completely from scratch:

  1. Algorithmic Alterations: Many changes were made to the code that implements the usual (but certainly non-trivial) SLAM machinery such as Gauss-Newton optimization, point triangulation, SLAM initialization, etc.

    • For instance, the original PTAM initializes by detecting and decomposing a homography a la Faugeras, which recovers 8 candidate solutions although there are at most 4; furthermore, the original implementation carries over a mistake from Faugeras' paper, which confuses the arbitrary scale (a by-product of the SVD) with the actual distance of the plane from the origin along its normal (which cannot be recovered). I implemented a new routine which is still homography based, but the algorithm now returns at most four solutions (including the case of only two solutions, which basically corresponds to motion along the normal of the plane). In short, the solution is loosely based on Zhang's observation that the middle singular vector of the homography is orthogonal to both the plane normal and the translation. Note that the new decomposition method eliminates the need for the "p"-postfixed rotation and translation fields, as well as the dubious scale field "d", in the "HomographyDecomposition" structure, for the reasons explained above.
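To make the observation about the middle singular vector concrete, here is a small numerical check in plain C++ (no OpenCV). It is only an illustrative sketch, not the decomposition routine in this repository. It assumes a Euclidean homography of the form H = R + t nᵀ with the plane distance absorbed into t, and it expresses the translation in the first camera frame as Rᵀt; all function names are hypothetical. For v = n × (Rᵀt), one gets HᵀH v = v, so v is a right singular vector of H with singular value 1 (the middle one for a homography of this form) and is orthogonal to both the normal and the translation by construction.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Matrix-vector product M * v.
Vec3 mul(const Mat3& M, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) r[i] += M[i][j] * v[j];
    return r;
}

// Transposed product M^T * v.
Vec3 mulT(const Mat3& M, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) r[i] += M[j][i] * v[j];
    return r;
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

// Rotation by angle a (radians) about the z axis.
Mat3 rotZ(double a) {
    const double c = std::cos(a), s = std::sin(a);
    Mat3 R = {{{c, -s, 0.0}, {s, c, 0.0}, {0.0, 0.0, 1.0}}};
    return R;
}

// Assumed model: H = R + t * n^T (plane distance absorbed into t).
Mat3 makeHomography(const Mat3& R, const Vec3& t, const Vec3& n) {
    Mat3 H = R;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) H[i][j] += t[i] * n[j];
    return H;
}

// Largest absolute component of (H^T H v - v) for v = n x (R^T t).
// A value near zero means v is a right singular vector of H with
// singular value 1.
double invariantResidual(const Mat3& R, const Vec3& t, const Vec3& n) {
    const Mat3 H = makeHomography(R, t, n);
    const Vec3 v = cross(n, mulT(R, t));
    const Vec3 w = mulT(H, mul(H, v));
    double res = 0.0;
    for (int i = 0; i < 3; ++i) res = std::fmax(res, std::fabs(w[i] - v[i]));
    return res;
}
```

Calling invariantResidual(rotZ(0.3), {0.2, -0.1, 0.4}, {0.0, 0.6, 0.8}) should return a value at round-off level. A decomposition can exploit this by reading v off the SVD of H and searching for the normal and translation in the plane orthogonal to it.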

A few more alterations that I can think of are,

In general, there have been many similar alterations (perhaps improvements) while converting the entire code and I really can't enumerate each and every one here... It should be noted that, in the context of these changes, I really HAD to change the names of certain variables and functions because they were simply pointing in the wrong direction. For instance, "MakeTemplateSubPix" is a misleading name for a function that simply computes a Jacobian Gram-matrix accumulator (it was renamed to "PrepGNSubPixStep" which - in my opinion - is much closer to what the method actually does). However, I should point out that most variable names were extremely well chosen; not only did I keep them, but I was also heavily influenced into adopting the exact same (or similar) naming conventions.
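For readers unfamiliar with the term, a "Jacobian Gram-matrix accumulator" is the sum of JᵀJ over the pixels of a patch, which is the matrix a Gauss-Newton step has to invert. The sketch below shows the idea in plain C++. It is not the code of "PrepGNSubPixStep"; the function name, the row-major patch layout, the central-difference gradients and the per-pixel Jacobian J = [dI/dx, dI/dy, 1] (two translation parameters plus an intensity offset) are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Accumulates the Gram matrix sum(J^T J) over the interior pixels of a
// row-major grayscale patch, with the assumed per-pixel Jacobian
// J = [dI/dx, dI/dy, 1]. Border pixels are skipped because the central
// differences need both neighbours.
Mat3 accumulateGram(const std::vector<double>& patch, int width, int height) {
    assert(width >= 3 && height >= 3);
    assert(patch.size() == static_cast<std::size_t>(width) * height);

    Mat3 G{};  // zero-initialized
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            const double gx =
                0.5 * (patch[y * width + x + 1] - patch[y * width + x - 1]);
            const double gy =
                0.5 * (patch[(y + 1) * width + x] - patch[(y - 1) * width + x]);
            const std::array<double, 3> J{gx, gy, 1.0};
            for (int i = 0; i < 3; ++i)
                for (int j = 0; j < 3; ++j) G[i][j] += J[i] * J[j];
        }
    }
    return G;
}
```

Since the template patch does not change during the iterations, this matrix (and its inverse) only needs to be computed once per patch, which is presumably why it is prepared in a separate step.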

I have also added plenty of comments explaining (matlab style) the actual algorithms behind the code (typically non-linear LS formulations in Patchfinder.cpp, Calibrator.cpp, SmallBlurryImage.cpp and CalibImage.cpp). Now, my comments dominate the code (excluding the Bundle.cpp file, which is sufficiently commented); in several cases I was forced to erase existing comments because they were either misleading or very vague (perhaps they were added later on?). So all in all, if you read a comment, chances are 3 in 4 that it is mine.

  2. Other changes concern the software engineering side of things: for starters, no TooN, no libCVD and no GVars. Enter OpenCV.

I should note that, with the exception of a Timer class from libCVD and the M-estimator header, every other file sustained major changes in order to accommodate the use of OpenCV structures. So, even where I did not change the specifics of an algorithm, I still had to make changes in order for the code to work exactly or approximately as the original.

Why OpenCV instead of the original dependencies?

Although TooN, libCVD and GVars are brilliant to say the least, they are not well documented online (except for the API reference of course) and examples simply do not exist. It was much easier to go through the entire code line by line than to try to write a working example from the API reference alone. Interestingly, even when certain functions had a clear use in the original PTAM, I still had to read the code in TooN or libCVD to gain insight into their specifics. One such example is "halfSample", which is effectively averaging decimation: I originally used pyramidal downsampling and then simple resizing with interpolation as provided by OpenCV, and discovered that Rosten's simple routine gives better results for some reason, and it's just plain faster!
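To illustrate what "averaging decimation" means here, the following is a minimal sketch in plain C++: each output pixel is the mean of a 2x2 block of input pixels. It is not the libCVD routine nor the version in this repository; the GrayImage struct, the function name, the truncating integer division and the dropping of a trailing odd row or column are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal 8-bit grayscale image, row-major.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Halves the resolution by averaging each 2x2 block of input pixels.
// A trailing odd row or column is dropped; the mean is truncated.
GrayImage halfSampleSketch(const GrayImage& in) {
    assert(in.pixels.size() ==
           static_cast<std::size_t>(in.width) * in.height);

    GrayImage out{in.width / 2, in.height / 2, {}};
    out.pixels.resize(static_cast<std::size_t>(out.width) * out.height);

    for (int y = 0; y < out.height; ++y) {
        for (int x = 0; x < out.width; ++x) {
            const std::uint8_t* top =
                &in.pixels[static_cast<std::size_t>(2 * y) * in.width + 2 * x];
            const std::uint8_t* bottom = top + in.width;
            const unsigned sum = top[0] + top[1] + bottom[0] + bottom[1];
            out.pixels[static_cast<std::size_t>(y) * out.width + x] =
                static_cast<std::uint8_t>(sum / 4);
        }
    }
    return out;
}
```

Unlike a Gaussian pyramid step, there is no separate blur pass: the 2x2 box average is both the low-pass filter and the decimation, which is one plausible reason for the speed difference mentioned above.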

Thus, I ended up lifting and modifying code such as the SE3, SO3, SO2, SE2 (TooN), GUI (GVars) and FAST (libCVD) classes and the OpenGL interface to provide the missing functionality using OpenCV Matx and Mat_ objects. Needless to say, some TooN code was a great eye-opener, for instance the near-zero approximations to the Lie exponential, or the brilliant operator overloading schemes (I created mine, but the essential ideas came straight out of the original code).
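As an illustration of what a near-zero approximation to the Lie exponential looks like, here is a sketch of the SO(3) exponential map (Rodrigues' formula) in plain C++. The coefficients sin(θ)/θ and (1 - cos(θ))/θ² become numerically unreliable as θ approaches 0, so they are replaced by their Taylor expansions for small angles. This is not the TooN code nor the class in this repository; the function name and the switching threshold of 1e-8 on θ² are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Exponential map from an axis-angle vector w to a rotation matrix:
//   R = I + A [w]x + B [w]x^2,  A = sin(t)/t,  B = (1 - cos(t))/t^2,
// where t = |w|. Near t = 0 both coefficients are 0/0, so their Taylor
// series are used instead (the threshold is an assumed value).
Mat3 so3Exp(const Vec3& w) {
    const double theta_sq = w[0] * w[0] + w[1] * w[1] + w[2] * w[2];
    double A, B;
    if (theta_sq < 1e-8) {
        A = 1.0 - theta_sq / 6.0;   // sin(t)/t       ~ 1   - t^2/6
        B = 0.5 - theta_sq / 24.0;  // (1-cos t)/t^2  ~ 1/2 - t^2/24
    } else {
        const double theta = std::sqrt(theta_sq);
        A = std::sin(theta) / theta;
        B = (1.0 - std::cos(theta)) / theta_sq;
    }

    // [w]x^2 = w w^T - theta_sq * I
    Mat3 R{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) R[i][j] = B * w[i] * w[j];
    for (int i = 0; i < 3; ++i) R[i][i] += 1.0 - B * theta_sq;

    // A * [w]x (skew-symmetric part)
    R[0][1] -= A * w[2];  R[0][2] += A * w[1];
    R[1][0] += A * w[2];  R[1][2] -= A * w[0];
    R[2][0] -= A * w[1];  R[2][1] += A * w[0];
    return R;
}
```

For example, so3Exp({0, 0, 0}) gives the identity, and a rotation of π/2 about the z axis maps the x axis onto the y axis.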

Unlike other PTAM spawns (e.g., ETH PTAM or the recent ORBSLAM), I kept the OpenGL interface simply because it's awesome! It took a bit of effort to convert it to work with OpenCV matrices, let alone isolate the useful stuff from the rest of libCVD, but it was worth it! The same goes for GVars, which I believe is much more useful than the standard .xml storage.

A few words about the functionality of libCVD, GVars and TooN in the following source code directories:

a) GCVD : Contains some GCVD functionality (SO2, SO3, SE2, SE3) and the OpenGL interface (which I think is mostly Klein's work). There is an OpenGL window and a few custom-developed controls which, in the world of Linux, are very, very useful; I am very arrogant myself, but also wise enough to keep this stuff, because it not only saves time but is also tastefully done! And I didn't even bother changing colors, etc., because they are simply nice as they are!

b) FAST : Just the FAST stuff from libCVD with simple modifications for OpenCV Mat_ structures.

c) Persistence : This directory contains code that implements the functionality of GVars. I converted the original GV3 and gvar3 objects to PV3 and pvar3 to work with OpenCV Mat and Vec<P, int> types (I actually had to do a lot of template "tricks" to somehow accommodate the fact that Mat objects don't come with an a priori fixed size; all in all, I haven't tested persistence with Mat_ objects, but it is unnecessary for now). I kept the GUI class nearly verbatim. I should note that perhaps it would be better to manage events from OpenGL, but it is unimportant.

INSTALLATION

** The necessary dependencies are OpenGL and OpenCV.

** To compile, create a directory build in the root directory of GPTAM, enter it with cd build and run cmake .. followed by make.

** I wasn't able to "automate" the OpenCV library settings, so I hard-coded the paths in the CMakeLists.txt, which are the usual ones: /usr/local/lib and /usr/local/include.

RUNNING PTAM

In the root directory you will find a file calibrator_settings.cfg containing the settings of the calibrator, including the initial parameters of the camera. The calibrator will use default settings if this file is not found.

To specify which camera to use, set the corresponding index (-1 for default and 0, 1, 2, ... for other USB cameras) in the settings file (in both the calibrator and PTAM settings). For example, the following line,

Camera.Index=1

specifies camera #1 as the device used by the capture object.

Note that PTAM requires the camera intrinsic and distortion parameters to be stored in a file named settings.cfg. You will have to fill this file with the intrinsic parameters provided by the calibrator. After you save the calibrated parameters, they should be stored in a file named camera.cfg. Simply rename this file to settings.cfg and you should be able to run gptam with the calibrated camera.

Examples of calibrator_settings.cfg and settings.cfg are now stored in the root directory of the repository.