terraref / computing-pipeline

Pipeline to Extract Plant Phenotypes from Reference Data
BSD 3-Clause "New" or "Revised" License

Develop Orthophoto to Supplement Stitched Stereo Images #355

Closed. smarshall-bmr closed this issue 6 years ago.

smarshall-bmr commented 7 years ago

On the call today we talked about some of the unavoidable shortcomings of image stitching and how we might work around these problems in the future.

First some background. You can skip the next 3 paragraphs if you want to get to the point.

Right now our image stitch is based on a 2m "target distance." This works just fine at the beginning of the season, when the plants are all roughly the same height and so can all be close to the target distance. (I'm going to write target distance as TD from here on out.) Later in the season, as plant heights start to diverge, we run into difficulties with the images not lining up. For example, at the end of the sorghum4 season plant heights ranged between 70 and 400cm. To get the best images I set the TD at 350cm above the ground, but the tops of plants in the field ranged from 75% to 240% of this TD.

The practical upshot of this is that a point farther from the stereo pair appears at nearly the same location from the point of view of each stereo camera, and vice versa for points closer to the pair. For that matter, a point that is relatively close to one of the cameras could even be completely outside the field of view of the other camera. The math on this gets complicated in a hurry: the apparent offset of a point between the two views (the disparity) is inversely proportional to its distance, so the offset's sensitivity to a change in distance falls off as the inverse square (the closer a point gets to the stereo pair, the more pronounced the change is), with the geometry coming from the intersection of two spheres.
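To make that relationship concrete, here is a minimal numeric sketch; the focal length and baseline are hypothetical placeholders, not the actual gantry stereo parameters:

```python
# Hypothetical stereo parameters, for illustration only.
F_PX = 2000.0       # focal length in pixels (assumed)
BASELINE_M = 0.20   # separation of the two cameras in metres (assumed)

def disparity_px(distance_m):
    """Apparent shift of a point between the two cameras: d = f * B / Z.

    Disparity is inversely proportional to distance Z, so its sensitivity
    to a change in distance falls off as 1 / Z**2.
    """
    return F_PX * BASELINE_M / distance_m

for z in (1.0, 2.0, 3.5, 5.0):  # metres; 3.5 m was the late-season TD
    print(f"Z = {z:.1f} m -> disparity = {disparity_px(z):6.1f} px")
```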

The fact that a photograph is a perspective projection causes further stitching issues when the subject is close, because adjacent exposures see it from noticeably different angles. Stitching images from aerial photography, or even from drones, can largely ignore this effect because the translation between images is small relative to the distance to the target. Take a hypothetical scenario where a drone captures an image every 5m while flying 40m above the target: if one image were taken directly over an object, the difference in perspective after translating 5m would be about 7.13 degrees. The gantry system takes an image every 50cm at 200cm from the target (ideally), making the same difference in perspective between adjacent images over 14 degrees.
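The arithmetic behind those two angles is just the arctangent of the translation over the distance to the target; a quick check of the numbers in the paragraph above:

```python
import math

def perspective_shift_deg(baseline_m, distance_m):
    """Angle between the lines of sight from two adjacent capture
    positions to a target directly beneath the first one."""
    return math.degrees(math.atan(baseline_m / distance_m))

print(perspective_shift_deg(5.0, 40.0))  # drone:  ~7.13 degrees
print(perspective_shift_deg(0.5, 2.0))   # gantry: ~14.04 degrees
```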

This is where the idea of an orthophoto comes in. An orthophoto is impossible to take in real life; it acts instead like a blueprint. For example, a photograph of the side of a car is only perpendicular to the car at a single point, while a side-view blueprint of the same car is perpendicular everywhere. (For the purposes of the math, a blueprint is like a photo taken from an infinite distance away.) The good news is that Zongyang's 3D models of the field from the stereo system are the most difficult step in making an orthophoto, or, to be more correct in this case, an orthomosaic. The 3D model of the field allows a computer to generate an orthographic projection that would otherwise be impossible to capture with a camera, because each pixel represents a point projected from exactly perpendicular to the ground. It's a top-view blueprint of the field, so to speak.

It may seem like a lot of mechanization to take a set of 2D photos and build a 3D pointcloud simply to reduce the data set back to a 2D image, but the result is a really good way to compress a lot of data into a very small file. It also avoids the need for specialized software to view 3D data, which is useful for reviewing data and would be a boon for extractors later on.

The following image is a rasterized image of a pointcloud from the 3D laser system on the gantry. This is a rough raster, but it represents what an orthophoto would look like, albeit an orthophoto would be an RGB image while this one is colored by height. The orthophoto could also be rasterized by height like this, though, given that it comes from a 3D model.

raster_image
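A minimal sketch of how a top-down raster like this could be produced from a pointcloud, keeping the tallest point per grid cell; the grid resolution and the keep-the-tallest rule are illustrative assumptions, not the pipeline's actual parameters:

```python
import numpy as np

def rasterize_ortho(points, colors, cell_m=0.005):
    """Flatten an (N, 3) XYZ pointcloud into a top-down RGB raster.

    Each output pixel gets the color of the highest point that falls in
    its grid cell, i.e. an orthographic (straight-down) projection.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    col = ((x - x.min()) / cell_m).astype(int)
    row = ((y - y.min()) / cell_m).astype(int)

    rgb = np.zeros((row.max() + 1, col.max() + 1, 3), dtype=colors.dtype)
    height = np.full(rgb.shape[:2], np.nan)  # NaN marks empty cells

    for i in np.argsort(z):  # ascending, so the tallest point in a cell wins
        rgb[row[i], col[i]] = colors[i]
        height[row[i], col[i]] = z[i]
    return rgb, height
```

The same `height` array gives the height-colored raster shown above; the `rgb` array is the orthophoto proper.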

Implementing orthomosaics

ZongyangLi commented 7 years ago

@smarshall-bmr Thanks for the details on the orthophoto. I will try my best to follow your idea; please point it out if I have misunderstood anything.

  1. The TD (target distance) really does span a big range for late-season scans; it might be impossible to get a well-aligned full-field map because the FOV changes a lot, from 75% to 240% of the TD.
  2. The orthophoto is actually an RGB image with 'height' or 'depth' information; we may be able to use the TD information from the orthomosaic to generate a better full-field RGB map.
  3. It seems not too hard to create an orthomosaic using the stereo method; I can run a test and see how it looks.
  4. The only issue in my mind now is the stereo calibration. The calibration parameters I am using work for 2016-10-11 to 2016-11-26, but the stereo image pairs seem different after that, and I am not able to create a stereo depth image using those parameters. We need to make sure the stereo cameras are not changing their relative position. (A quick check is sketched below.)
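One way to check whether the stored calibration still matches newer image pairs is to compute a disparity map from a rectified pair and watch the fraction of valid pixels; this is only a sketch, and the file names and matcher settings below are placeholders, not pipeline values:

```python
import cv2
import numpy as np

# If the calibration still matches the camera geometry, a rectified pair
# should yield a mostly valid disparity map. Placeholder file names.
left = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,  # must be divisible by 16
    blockSize=7,
)

disp = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM scales by 16
print(f"valid disparity pixels: {np.mean(disp > 0):.1%}")
# A sharp drop in this fraction after a given date would suggest the
# cameras' relative position changed and recalibration is needed.
```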
smarshall-bmr commented 7 years ago

@ZongyangLi

  1. That's exactly right.
  2. The orthophoto itself doesn't have any depth information, but it is being pulled from a 3D model, so you can simultaneously pull a second image that does include depth information. I think there are some image formats that allow a 4th value to be stored, so instead of an RGB image it would be RGBD, where the "D" is a depth value; maybe a format like this would be useful for extractors? (See the sketch after this list.)
  3. I think there is a way to go from a set of photos to an orthophoto in a single step, but a 3D model is always generated at some point; an orthophoto is a rasterized 3D model by definition.
  4. If you try to make an orthophoto, you will want to build the cleanest 3D model possible by using a program that aligns the images algorithmically. Some of these algorithms, like SIFT, don't need any starting position data at all. Most of the photogrammetry programs I'm aware of generate scale-accurate models without calibration. In fact, my experience with VisualSFM has shown that allowing the program to run its own alignments is far more accurate than trying to define positions.
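On point 2, a minimal sketch of one way the RGBD idea could be stored, as a 4-band GeoTIFF with depth in the fourth band; the shapes, dtype, and millimetre depth units are illustrative assumptions, not an agreed pipeline format:

```python
import numpy as np
import rasterio

# Placeholder arrays standing in for a real orthophoto and depth raster.
# GeoTIFF bands must share one dtype, so depth is stored as uint16 mm.
rgb = np.zeros((3, 1024, 1024), dtype=np.uint16)
depth_mm = np.zeros((1024, 1024), dtype=np.uint16)

# Georeferencing (crs/transform) omitted for brevity.
with rasterio.open(
    "orthophoto_rgbd.tif", "w",
    driver="GTiff",
    height=1024, width=1024,
    count=4, dtype="uint16",
) as dst:
    dst.write(rgb, [1, 2, 3])     # bands 1-3: RGB
    dst.write(depth_mm, 4)        # band 4: depth in millimetres
    for i, name in enumerate(("red", "green", "blue", "depth_mm"), start=1):
        dst.set_band_description(i, name)
```

Extractors could then read color and depth from one aligned file instead of juggling a pointcloud viewer.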
dlebauer commented 7 years ago

Would http://opendronemap.org/ meet this need?

craig-willis commented 6 years ago

Stale issue. No comments in last 2 months. If necessary, create new issues based on this discussion.