thomasfermi / Algorithms-for-Automated-Driving

Each chapter of this (mini-)book guides you in programming one important software component for automated driving.
https://thomasfermi.github.io/Algorithms-for-Automated-Driving/Introduction/intro.html

Add chapter on camera calibration #3

Closed · thomasfermi closed this issue 3 years ago

thomasfermi commented 3 years ago

This issue tracks progress on the (currently non-existent) chapter on camera calibration.

The scope of that chapter is described in the book under "Future chapters":

How do we estimate the camera height above the road, as well as the camera roll, pitch, and yaw angles? In the chapter on Lane Detection, we got these parameters directly from the simulation. Of course, we cannot do this in the real world. In this chapter we will implement a camera calibration module to estimate the camera extrinsics.

My current projection is that I might start working on this in mid-2021. If someone else is interested in helping, or in writing this chapter on their own, that would be super awesome.

Current ideas/references:

I found that people typically do this in two steps:

  1. Find the vanishing point (by detecting lines in a single image, or via optical flow from multiple images)
  2. Find roll, pitch, yaw from the vanishing point (see the sketch after this list)
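
For step 2, the math fits in a few lines. Here is a minimal sketch, assuming a pinhole camera with intrinsic matrix K and the usual camera frame convention (x right, y down, z forward); the function name and sign conventions are my own, not taken from any particular reference:

```python
import numpy as np

def angles_from_vanishing_point(u, v, K):
    """Estimate camera pitch and yaw from the vanishing point (u, v) of the
    lane lines, i.e. of the vehicle's forward direction. Note that roll
    cannot be recovered from a single vanishing point."""
    # Back-project the vanishing point into a direction in the camera frame
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    d /= np.linalg.norm(d)
    # Yaw: horizontal deviation of the forward direction from the optical axis
    yaw = np.arctan2(d[0], d[2])
    # Pitch: vertical deviation; minus sign because the camera y-axis points down
    pitch = np.arctan2(-d[1], np.hypot(d[0], d[2]))
    return pitch, yaw
```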

References:

thomasfermi commented 3 years ago

I just checked how they do it in openpilot. They check for instances when the vehicle is going "straight and fast". Then they grab the translation vector they get from visual odometry. The reason: since the vehicle is driving straight, the vehicle's forward axis is more or less identical to the direction of the translation vector. And once you have the vehicle's forward axis in the camera reference frame, you can estimate how the optical axis (the z-axis) of the camera is tilted with respect to the vehicle's forward direction. Hence you get the extrinsic rotation matrix!

See lines 151-165 in calibrationd.py of openpilot.

The variable rpy in their code stands for roll, pitch, yaw. Their method does not allow recovering all three, so they assume that roll is zero (at least that's my impression).
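
To make that concrete, here is a minimal sketch of the idea as I understand it. This is not openpilot's actual code; the function name and sign conventions are mine:

```python
import numpy as np

def extrinsic_angles_from_translation(t_cam):
    """Estimate camera pitch and yaw from the visual-odometry translation
    direction t_cam, expressed in the camera frame (x right, y down,
    z forward). While driving straight, t_cam approximates the vehicle's
    forward axis. Roll is unobservable this way, so it is assumed zero."""
    tx, ty, tz = t_cam
    # Only ratios of the components enter, so the unknown VO scale drops out
    yaw = np.arctan2(tx, tz)
    pitch = np.arctan2(-ty, np.hypot(tx, tz))
    return pitch, yaw
```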

I think that this is a nice and simple approach, and I would like to try it with Carla. openpilot does visual odometry with a neural net, but for the book it might be nicer to do the traditional feature/keypoint-based approach. One could use something like this:

blog post: http://avisingh599.github.io/vision/monocular-vo/

python implementation of that blog post: https://github.com/yoshimasa1700/mono_vo_python/
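
A rough sketch of a single feature-based VO step with OpenCV (ORB features, essential matrix, pose recovery); the parameter values are placeholders, and a real implementation would track keypoints over many frames and handle degenerate cases:

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """Estimate the relative rotation R and (unit-scale) translation t
    between two consecutive grayscale frames, given intrinsics K."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    # Essential matrix with RANSAC to reject outlier matches
    E, mask = cv2.findEssentialMat(pts1, pts2, K,
                                   method=cv2.RANSAC, prob=0.999, threshold=1.0)
    # Decompose E; t comes back as a unit vector (monocular scale is unknown)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```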

The problem with monocular visual odometry is scale, and the blog post solves it by using ground-truth data (a little bit of cheating there). But if you check lines 151-165 in calibrationd.py of openpilot, you can see that for our purpose the scale does not matter, since it cancels out in the divisions.
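
A quick numerical sanity check of that claim (the numbers are made up):

```python
import numpy as np

t = np.array([0.02, -0.05, 1.0])   # some VO translation direction
for s in (1.0, 3.7):               # apply two different unknown scales
    ts = s * t
    yaw = np.arctan2(ts[0], ts[2])
    pitch = np.arctan2(-ts[1], np.hypot(ts[0], ts[2]))
    print(pitch, yaw)              # prints the same angles for both scales
```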

If code along those lines works, it could be the building block of the camera calibration chapter.

The algorithm is the heart of the chapter, but one would also need to explain some theory. A great reference on visual odometry would be David Scaramuzza's lectures. They even have cool exercises (unfortunately in Matlab).

thomasfermi commented 3 years ago

Update: There is some ongoing work regarding this chapter, which is being discussed on the book's Discord server.