Coastal-Imaging-Research-Network / UAV-Processing-Toolbox

LEGACY - Use CIRN-Quantitative-Coastal-Imaging-Toolbox instead. Codes, documentation and discussion for the use of single stationary cameras and small drones for Argus-like sampling.
https://github.com/Coastal-Imaging-Research-Network/UAV-Processing-Toolbox/wiki/UAV-Processing-Manual
GNU General Public License v3.0

Request for automated Geom tool #25

Open allisonpenko opened 7 years ago

allisonpenko commented 7 years ago

Request for a robust automated geometry solving tool!! Ideas and discussion - please post here

jstanleyx commented 7 years ago

How do you automate the identification of which GCP is which? The user has to be involved somehow. At best you have a tool that lets the user tie each GCP to the pixel where it appears in the image and then get a solution, like geomtool in the CIL does.

SRHarrison commented 7 years ago

I imagine a GUI that shows an oblique image (to be specified, e.g. snap, timex, or brightest) representative of the camera FOV; the user then provides a number of corresponding points in XYZ and UV coordinates. It should give the option to read those points from a text file or to take user input by clicking on the figure and entering numerical values for x, y, z, u, and v. This establishes the rotation and translation (the extrinsic parameters, at least for the given FOV) between the camera and the local coordinate system. This info could be output in the current GCP format used in the UAV toolbox or as an entry in some type of database managed by the user (e.g. a MATLAB structure or netCDF).
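For concreteness, here is a minimal sketch of that picking workflow in MATLAB; the file names, text-file layout, and `gcp` field names are assumptions for illustration, not the toolbox's actual GCP format:

```matlab
% Minimal sketch of the picking workflow described above (field names and
% the text-file layout are assumptions, not the toolbox's GCP format).
I = imread('snapUAV.jpg');          % oblique snap/timex/brightest image
figure; imshow(I); hold on
xyz = load('gcpSurvey.txt');        % assumed N x 3 file of surveyed x,y,z

nGcp = size(xyz,1);
uv = nan(nGcp,2);
for k = 1:nGcp
    title(sprintf('Click GCP %d of %d', k, nGcp))
    [u, v] = ginput(1);             % user clicks the GCP in the image
    uv(k,:) = [u v];
    plot(u, v, 'r+')
end

% bundle into a structure that could be written out or passed to a solver
for k = 1:nGcp
    gcp(k).x = xyz(k,1); gcp(k).y = xyz(k,2); gcp(k).z = xyz(k,3);
    gcp(k).U = uv(k,1);  gcp(k).V = uv(k,2);
end
```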

jstanleyx commented 7 years ago

This is called "geomtool" in the CIL.

KateBrodie commented 7 years ago

I think this issue was also referencing improving the frame-by-frame geometry solution (@allisonpenko is that true?). That is, instead of using an image-intensity threshold to find the centroid of a fixed object to solve each frame's geometry, perhaps try some type of image cross-correlation on a subset area around the identifiable object. Or perhaps something more advanced (SfM-like, even?). @hokiespurs do you have any experience "stabilizing" imagery from a hovering UAS? Or any ideas on how to improve this portion of the toolbox?
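A rough sketch of that cross-correlation idea, assuming the Image Processing Toolbox is available (for normxcorr2); the feature location, window sizes, and file names are placeholders:

```matlab
% Sketch of cross-correlating a small template around a fixed feature
% against a search window in the next frame (needs normxcorr2 from the
% Image Processing Toolbox; locations and sizes are placeholders).
refFrame = rgb2gray(imread('frame0001.jpg'));
newFrame = rgb2gray(imread('frame0002.jpg'));

u0 = 812; v0 = 455;                 % feature location in the reference frame
halfT = 15; halfS = 40;             % template and search half-widths (pixels)

template = refFrame(v0-halfT:v0+halfT, u0-halfT:u0+halfT);
search   = newFrame(v0-halfS:v0+halfS, u0-halfS:u0+halfS);

c = normxcorr2(template, search);
[ypeak, xpeak] = find(c == max(c(:)), 1);

% centre of the best-matching patch within the search window
rowC = ypeak - halfT;  colC = xpeak - halfT;

% displacement of the feature relative to its reference position
dv = rowC - (halfS + 1);
du = colC - (halfS + 1);
% a weak peak (e.g. max(c(:)) < 0.7) could be flagged as a failed match
```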

@jstanleyx could geomtool be incorporated into the UAV toolbox to solve the initial geometry, or portions of it? Are there parts of geomtool that you think are great? Are there parts that need improvement to make it more robust or more user-friendly? Could it be easily adapted to the workflow @SRHarrison suggests without a connection to the CIL database?

hokiespurs commented 7 years ago

An improvement could be to allow the points to be clicked in an arbitrary order. The code would then solve every possible set of UV-to-XYZ correspondences and select the least-squares solution with the best fit. There are so few control points that this would add only a small amount of processing time. @jstanleyx, I'm not familiar with "geomtool"; is this already implemented?
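A sketch of that brute-force correspondence search; `solveGeomLSQ` is a hypothetical stand-in for whatever nonlinear geometry fit is used, assumed to return the solved geometry and an RMS reprojection residual in pixels:

```matlab
function [geom, order] = solveAnyOrder(xyz, uv)
% Try every uv -> xyz assignment and keep the best least-squares fit.
% solveGeomLSQ is a hypothetical stand-in for the toolbox's nonlinear
% geometry solver, assumed to return the geometry and RMS residual (pixels).
orders  = perms(1:size(xyz,1));     % every possible uv -> xyz assignment
bestRms = inf;
for k = 1:size(orders,1)
    trial = orders(k,:);
    [geomTry, rmsTry] = solveGeomLSQ(xyz(trial,:), uv);
    if rmsTry < bestRms
        bestRms = rmsTry;
        geom    = geomTry;
        order   = trial;
    end
end
% e.g. with 6 GCPs this is only factorial(6) = 720 trial solutions
end
```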

@KateBrodie, I do not have much experience stabilizing "stationary" UAS imagery. I was emailing with Kilian Vos (not sure of his github handle), and he is looking into how the z=0 assumption for reference points affects the solution, so he may have some good thoughts here as well. Here are my thoughts, organized by the assumption being made:

(1) Assume there is neither camera rotation nor camera translation (everything is perfect)

(2) Assume there is camera rotation but no camera translation (gimbal movement; GPS is excellent)

(3) Assume there is both camera rotation and camera translation (gimbal movement and UAS position drift)

Maybe this is two different issues? One for algorithmic improvements to UAS stabilization and which category of assumptions we want to be making behind the scenes, and another for GUI implementation/improvements?

bergsmaE commented 7 years ago

It would be cool to identify the GCP initially and track it as the video/image collection goes. I agree with @jstanleyx that geomtool can give you this initial pick and it already exists; no need to duplicate work.
One thing to bear in mind: most of the Argus products are time-averaged, meaning that you probably end up with a best-fit, representative, or average camera position/orientation for individual products like cBathy depth estimations. My best guess is that that's why (1) works relatively well (not so surprising). For other UAV applications (which I obviously don't know enough about, but I am catching up), it might be more important.

Considering the effect of the z=0 assumption: for the typical Argus products (especially cBathy) this is not negligible when the tidal range is large relative to the camera height (I wrote a PhD thesis about that). Because pixels move in the real-world domain, we often observed a smoothing effect in cBathy. However, I presume you aim to look almost vertically downward, and the UAV is typically flying/hovering high enough to diminish the z=0 assumption issue.
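To put a rough number on it: by similar triangles, a point whose true elevation differs by dz from the assumed z=0 plane is displaced horizontally by roughly dz * R / H after rectification, where R is the horizontal range to the point and H the camera height. A quick back-of-the-envelope check (the numbers are made up for illustration):

```matlab
% Back-of-the-envelope check of the z=0 assumption (similar triangles):
% a point whose true elevation differs by dz from the assumed plane is
% displaced horizontally by roughly dz * R / H after rectification.
H  = 40;          % camera height above the z=0 plane (m)
R  = 300;         % horizontal range from camera to the point (m)
dz = 1.0;         % elevation error, e.g. tide above the assumed datum (m)

errXY = dz * R / H
% errXY = 7.5 m for this low camera; at H = 400 m the same dz gives 0.75 m
```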

hokiespurs commented 7 years ago

@bergsmaE

Sorry, I should have more clearly explained the z=0 comment. From my understanding, @RobHolman and @jstanleyx have written some really cool video registration and stabilization code for slightly wobbly UAS video. This took the UAS processing assumption from (1) to (2) and greatly improved the accuracy of the rectified imagery. A step in this stabilization methodology is to click some bright spots on the beach, identify their centroids, and track them through the unstable frames to augment the surveyed GCPs. In tracking these "reference points", it was my understanding that they are assigned z=0, since they are not necessarily surveyed GCPs and, with no DSM, z=0 is the next best thing. These "reference points" then essentially become "pseudo GCPs", which can be used to calculate the camera pose for each frame. It's a pretty cool technique! @RobHolman @jstanleyx, please correct me if I'm mistaken on the methodology here.
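For illustration, a toolbox-free sketch of that threshold-and-centroid step on a small subwindow; the window location, size, and threshold are placeholders, not the toolbox's actual settings:

```matlab
% Sketch of locating one bright "reference point" in a small subwindow:
% threshold and take an intensity-weighted centroid (no toolboxes needed).
frame = double(imread('frame0001.jpg'));
if ndims(frame) == 3, frame = mean(frame,3); end   % quick grayscale

u0 = 812; v0 = 455; half = 20;                     % subwindow around the feature
win = frame(v0-half:v0+half, u0-half:u0+half);

thresh = 0.8*max(win(:));                          % keep only the bright blob
mask   = win >= thresh;
w      = win .* mask;

[vv, uu] = ndgrid(v0-half:v0+half, u0-half:u0+half);
U = sum(uu(:).*w(:)) / sum(w(:));                  % intensity-weighted centroid
V = sum(vv(:).*w(:)) / sum(w(:));
% (U,V) is then paired with an assumed z=0 world point to act as a pseudo GCP
```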

So the point I was trying to make is that this z=0 assumption for reference points is valid only if we also assume there is negligible translation of the camera.

I definitely agree that the uncertainty of the DSM the imagery is projected onto will propagate into the x,y accuracy of the projected orthophoto.

Related to this discussion, @kvos just submitted a pull request with a really nice looking GUI for solving camera geometry #55

RobHolman commented 7 years ago

This discussion has sooo many different aspects that it is worth an overview and the development of some kind of a framework for discussion (rather than just jumping around randomly).

GUIs - In the CIL we use geomtool, a GUI that was developed over many years; it allows many options and assumes our local database. However, at its heart it uses the same nonlinear solution method discussed and used for the boot camp, wherein you supply world and image coordinates for a set of GCPs and solve for a selectable set of unknown extrinsic parameters. A GUI is certainly useful, but to be helpful it needs to be pretty modular (i.e. robust to local implementation details). That also requires agreement on input and output structures. I also view GUIs as a bit like icing on the cake, adding nothing substantial but improving flavor.
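For concreteness, a hedged sketch of that nonlinear solve: minimize the reprojection residual over whichever extrinsic parameters are left free and hold the rest fixed. `projectXYZ` is a hypothetical stand-in for the toolbox's world-to-image projection, the [x y z azimuth tilt roll] ordering is an assumption rather than the toolbox convention, and lsqnonlin requires the Optimization Toolbox:

```matlab
% Sketch of the nonlinear extrinsic solve (lsqnonlin, Optimization Toolbox).
% projectXYZ is a hypothetical world-to-image projection; the 6-vector
% ordering [x y z azimuth tilt roll] is an assumption.
function beta = solveExtrinsics(xyz, uv, beta0, freeMask, intrinsics)
% xyz: N x 3 GCP world coords, uv: N x 2 image coords
% beta0: initial guess for all six extrinsics; freeMask: logical, which to solve
    resid = @(p) reprojError(p, beta0, freeMask, xyz, uv, intrinsics);
    pBest = lsqnonlin(resid, beta0(freeMask));
    beta  = beta0;  beta(freeMask) = pBest;
end

function e = reprojError(p, beta0, freeMask, xyz, uv, intrinsics)
    beta = beta0;  beta(freeMask) = p;
    uvHat = projectXYZ(xyz, beta, intrinsics);   % hypothetical projection
    e = uvHat(:) - uv(:);                        % stacked pixel residuals
end
```

For the hovering-UAV case (2) above, for example, one would fix x, y, z (freeMask = logical([0 0 0 1 1 1])) and solve only the angles frame by frame.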

GCPs – As discussed, we have both surveyed and virtual GCPs. Sometimes the surveyed GCPs are ephemeral, for example a temporary target on a beach or snapshots of a GPS-equipped jet ski in the water. Virtual GCPs are usually features that are easy to find in images but hard to survey. Their pseudo-survey coordinates are found by inverting a previously known geometry using findXYZ, with an assumed value for one world coordinate, for example z = 0. If the camera is fixed, the selected value of z is pretty arbitrary since it only defines the direction of a ray; but, as Richie points out, if the camera moves (e.g. a UAV), errors in z will force errors in subsequent geometries (these will be small for small camera movement, as for a parked UAV). The GCP structure is fairly simple in the UAV demo and more complete in the CIL implementation.
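Conceptually, that pseudo-survey step amounts to intersecting the pixel's ray with an assumed horizontal plane. A minimal sketch (this is not findXYZ's actual calling convention, and `pixelRay` is a hypothetical helper returning the world-frame ray direction for a pixel given the solved geometry):

```matlab
% Conceptual sketch of the pseudo-survey step (not findXYZ's calling
% convention). pixelRay is a hypothetical helper returning the unit ray
% direction in world coordinates for a pixel, given geometry and intrinsics.
function xyz = pseudoSurvey(UV, camXYZ, geom, intrinsics, zAssumed)
    d = pixelRay(UV, geom, intrinsics);       % 3 x 1 world-frame ray direction
    t = (zAssumed - camXYZ(3)) / d(3);        % distance along ray to the plane
    xyz = camXYZ(:) + t*d(:);                 % intersection with z = zAssumed
    xyz(3) = zAssumed;
end
```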

Templates – We have looked at length into non-point GCPs: basically sub-windows in an image containing a recognizable pattern, like a house or sign, that can automatically be recognized given a decent first guess at the geometry (for example from the previous frame in a movie). In the UAV toolbox from the boot camp, automatic U-V localization is done through thresholding and a center of mass (and could trivially and usefully be made to find both bright OR dark features). But we have also done localization by full template correlation or some other similarity-matching algorithm in an auto-geom algorithm. Error estimates are required since we need to objectively flag failure or success. We have not used SIFT-type algorithms because we have mostly developed for fixed-camera situations. The idea is to determine the slight changes in viewing angles (and/or position) that maximally co-register the template to some reference location (from when the geometry was first solved).

Similarity Measures – Simple methods like correlation for co-registering a current template location to a past (reference) location are sensitive to changes in lighting, including changing sun angle and sunny/shady illumination. There are more sophisticated solutions, for example using mutual information, that are robust to illumination changes. These often come from medical imaging.

Cross-geom – Often seaward-facing cameras have no useful GCPs. We have investigated methods of doing cross-geometries, whereby the geometry solution of such a camera is found by matching features in the overlap area against a synchronous image from an adjacent camera whose geometry is known. Since most of the features are just waves or foam, the images must be synchronous.
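A sketch of the feature-matching half of that idea, assuming the Computer Vision Toolbox is available; SURF is just one placeholder choice of detector, not necessarily what was used:

```matlab
% Sketch of matching overlap features between a synchronous pair (needs the
% Computer Vision Toolbox; detector choice and file names are placeholders).
known   = rgb2gray(imread('c1_snap.jpg'));   % camera with a solved geometry
unknown = rgb2gray(imread('c2_snap.jpg'));   % seaward camera, no usable GCPs

ptsK = detectSURFFeatures(known);
ptsU = detectSURFFeatures(unknown);
[featK, validK] = extractFeatures(known, ptsK);
[featU, validU] = extractFeatures(unknown, ptsU);

pairs = matchFeatures(featK, featU);
uvKnown   = validK(pairs(:,1)).Location;     % pixels in the solved camera
uvUnknown = validU(pairs(:,2)).Location;     % same features in the other camera

% uvKnown can be projected to world coordinates with the known geometry
% (e.g. onto z = tide level), and those points then act as GCPs for the
% unknown camera's geometry solution.
```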

Database and record keeping – In the CIL we have developed a fairly extensive structure for each geometry solution. Standardization to at least some level will be important and should be established early (I say this having introduced the do-it-yourself version in the UAV toolbox for the boot camp).
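As a straw man for the standardization question, a minimal per-solution record might look like the following; every field name and value here is a suggestion only, not the CIL schema or the current toolbox format:

```matlab
% Straw-man geometry record (field names are suggestions only, not the CIL
% schema or the current UAV-toolbox structure).
rec.when       = datestr(now, 'yyyy-mm-dd HH:MM:SS');  % epoch of the solution
rec.station    = 'exampleUAVhover01';                  % hypothetical site/flight id
rec.betas      = [0 0 40 265 70 0];      % solved extrinsics, e.g. [x y z az tilt roll]
rec.intrinsics = struct('fx',2000,'fy',2000,'c0U',1920/2,'c0V',1080/2);
rec.gcpsUsed   = [1 2 4 5];              % indices into the GCP list
rec.rmsPix     = 0.6;                    % reprojection residual, pixels
rec.source     = 'nonlinear fit to clicked GCPs';
save('geomRecord_example.mat','rec')
```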

I’m sure there are other points. @hokiespurs @jstanleyx @SRHarrison @KateBrodie @allisonpenko @mpalmsten @bergsmaE

mpalmsten commented 7 years ago

I'd like to see the stabilization method in the UAV Toolbox go from the existing "template" version to the "similarity measures" version, maybe using mutual information. For the image stabilization I've done (from a tower-based camera affected by heating/cooling cycles), the Mattes mutual-information algorithm in the MATLAB Image Processing Toolbox has worked pretty well, although I think for the UAV toolbox it would be better to stay away from relying on a MATLAB toolbox.
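For reference, the toolbox-based version mentioned above looks roughly like this (imregconfig('multimodal') selects the Mattes mutual-information metric in the Image Processing Toolbox); frame names and optimizer settings are placeholders:

```matlab
% Minimal sketch of Mattes mutual-information registration using the Image
% Processing Toolbox (imregconfig('multimodal') selects that metric).
fixed  = rgb2gray(imread('frame0001.jpg'));   % reference frame (assumed name)
moving = rgb2gray(imread('frame0250.jpg'));   % later, shifted/rotated frame

[optimizer, metric] = imregconfig('multimodal');   % MattesMutualInformation
optimizer.MaximumIterations = 300;

tform = imregtform(moving, fixed, 'rigid', optimizer, metric);
stabilized = imwarp(moving, tform, 'OutputView', imref2d(size(fixed)));
```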

@hokiespurs @jstanleyx @SRHarrison @KateBrodie @allisonpenko @mpalmsten @bergsmaE @kvos