uzh-rpg / rpg_svo

Semi-direct Visual Odometry
GNU General Public License v3.0
2.1k stars 861 forks

SVO algorithm fails to track more than 80 frames on the ICL-NUIM Living Room dataset #87

Open Eliasvan opened 9 years ago

Eliasvan commented 9 years ago

Although there is texture in the "lr kt3" sequence (http://www.doc.ic.ac.uk/~ahanda/VaFRIC/iclnuim.html), SVO fails to track the whole sequence (1240 frames); it only tracks ~80 frames. I made a video of the results (compared with our simple SLAM system): https://www.youtube.com/watch?v=khSYi-s7mM4 (the relevant part starts at 5:17)

I'm aware that SVO works best with wide-FOV cameras and sideways motion, which this sequence doesn't satisfy well. However, I also tested on "lr kt2" (which has more sideways motion) and it doesn't work properly either.

I'll retry with more texture, maybe that will solve the problem.

cfo commented 9 years ago

Apart from the texture, I think there might be a problem in that the focal length in this dataset is negative. I remember that I had some issues with that when I wanted to use this dataset to test sparse image alignment.

Eliasvan commented 9 years ago

The "fy" of the camera is indeed negative. Is there some math in the SVO algorithm that implicitly assumes a positive focal length? An alternative might be to mirror the scene and apply a rotation to it; however, it would be nice if it worked on the unmodified dataset.

Eliasvan commented 9 years ago

I tested with the re-textured version of "lr kt3" (I applied the GIMP Retinex filter to every texture). To get an idea of what the first frame looks like: http://s23.postimg.org/jm3u3rt0p/scene_000.png

At least one of the problems has been solved: the initial feature count went from 88 to 415.

However, after around frame 130, SVO reports a drop of more than 50 features, and as a result the relocalizer kicks in. Lowering the "quality_min_fts" parameter solves most problems, at least at first sight.

You might say the last parameter setting is satisfactory, but unfortunately the relocalization seems to introduce a large error. The following video shows a comparison of SVO and ground truth: after the first 40 frames the initial homography is found and the camera follows the ground truth closely, but starting from frame 105 some features get lost, and at frame 235 the relocalization makes the camera jump to a wrong position. http://s000.tinyupload.com/index.php?file_id=05896109029247287548

cfo commented 9 years ago

@Eliasvan Can you check if you get better results when you put a minus before 'dy' on line 140 in sparse_img_align.cpp?

Eliasvan commented 9 years ago

Thank you very much for your reply; you hit the nail on the head: after recompiling with your suggested code change, and with all SVO config values set to factory defaults, SVO tracked the re-textured "lr_kt3" trajectory without a single lost frame (except before the initial homography, of course)!

Here is the AbsoluteTrajectoryError (http://www.tinyupload.org/c6b2ve5k9kw) and here is the RelativePoseError (http://www.tinyupload.org/o62c8nanpdf). Here is a short video of the results: http://www.tinyupload.org/cd2wndgtcvy

I also tested on the "lr_kt2" trajectory, and this one required some minor tuning of the config: I had to set "quality_min_fts" to '43' instead of the factory default of '50'. After doing so, the results were even better than for the previous trajectory: AbsoluteTrajectoryError (http://www.tinyupload.org/o43kb5dw6rv) and RelativePoseError (http://www.tinyupload.org/tmwsqg6dehz).

To summarize, even after applying your fix, the re-texturing was absolutely necessary to make SVO work on the ICL-NUIM living_room dataset. SVO held an (almost perfectly) rigid track of 121 features throughout the whole trajectory ("lr_kt2" and "lr_kt3" tested). I will upload the re-textured dataset and let you know when it is online, so that you can reproduce the results.

There is one last thing I'm not entirely satisfied with yet: the output map seems reasonable, but it seems that outliers are not filtered (as can be seen in the linked video). I'm only preserving the converged seeds of the depth filter, and I add a keyframe to the depth filter every 10 frames (I didn't find a way to let the FrameHandlerMono object signal this event automatically). You can check my SVO-interface code at https://github.com/Eliasvan/Multiple-Quadrotor-SLAM/blob/master/Work/SLAM/application/SVO/run_pipeline.cpp Could you please provide some feedback on this? (Note that I don't want to use ROS.)

Will you apply a fix for this issue? One thing that shouldn't be forgotten: what would happen if "fx" (instead of "fy") is negative, or both are?

Once again, thanks for your response!

Eliasvan commented 9 years ago

Hi,

Just to let you know, I've made a video where our simple SLAM system is run on the re-textured dataset. It also includes a brief comparison with SVO. Here is the video: https://www.youtube.com/watch?v=iUcnvrCxY24

Elias

cfo commented 9 years ago

Hi Elias, looks great! thanks, christian