centreborelli / s2p

Satellite Stereo Pipeline
GNU Affero General Public License v3.0

SkySat video poor results (due to pointing error being too large) #91

Open tonzowonzo opened 3 years ago

tonzowonzo commented 3 years ago

Hi,

This isn't so much a bug as a feature request. I recently read the paper "Automatic Stockpile Volume Monitoring using Multi-view Stereo from SkySat Imagery" and was wondering if the RPC correction method from it could eventually be added to S2P? It'd be great not only for SkySat but also for other sensors with less accurate pointing.

Thanks for your time! Tim

oleg-alexandrov commented 2 years ago

What you need is something called "bundle adjustment". It looks at all overlapping images and adjusts the cameras so that all rays corresponding to the same feature in all images intersect on the ground at a single point, which is the 3D location of the feature.
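For intuition, here is a minimal sketch of the objective bundle adjustment minimizes: the reprojection error between where each feature was observed and where the current camera and point estimates project it. This is an illustration only, not ASP code; the toy pinhole parametrization and the variable layout are assumptions made for the example.

```python
# Illustrative bundle adjustment objective (not ASP code): jointly refine
# camera poses and 3D points so that reprojection residuals shrink.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def project(points, rvec, tvec, focal):
    """Project Nx3 points with a toy pinhole camera (no distortion)."""
    cam = Rotation.from_rotvec(rvec).apply(points) + tvec
    return focal * cam[:, :2] / cam[:, 2:3]


def residuals(params, n_cams, n_pts, cam_idx, pt_idx, observed, focal):
    cams = params[:n_cams * 6].reshape(n_cams, 6)   # per camera: rvec (3) + tvec (3)
    pts = params[n_cams * 6:].reshape(n_pts, 3)     # 3D point positions
    proj = np.vstack([
        project(pts[j][None], cams[i, :3], cams[i, 3:], focal)
        for i, j in zip(cam_idx, pt_idx)
    ])
    return (proj - observed).ravel()


# observed[k] is the pixel where 3D point pt_idx[k] was matched in image
# cam_idx[k]; x0 stacks the initial camera parameters and triangulated points:
# sol = least_squares(residuals, x0,
#                     args=(n_cams, n_pts, cam_idx, pt_idx, observed, focal))
```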

The ASP package (of which I am a maintainer) supports bundle adjustment (https://stereopipeline.readthedocs.io/en/latest/tools/bundle_adjust.html), and then can do stereo (https://stereopipeline.readthedocs.io/en/latest/examples.html#rpc-camera-models).
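Schematically, that workflow (bundle adjustment first, then stereo using the resulting adjustments) looks something like the sketch below. File names are placeholders and only the basic options are shown; check the linked documentation for the settings that apply to your data.

```python
# Rough sketch of the ASP workflow mentioned above; file names are
# placeholders, and the linked docs are the authoritative reference.
import subprocess

images = ["left.tif", "right.tif"]           # placeholder image names
cameras = ["left_rpc.xml", "right_rpc.xml"]  # placeholder RPC camera files

# 1. Jointly refine the cameras; corrections are written under the ba/run prefix.
subprocess.run(["bundle_adjust", *images, *cameras, "-t", "rpc", "-o", "ba/run"],
               check=True)

# 2. Run stereo with the refined cameras by pointing at the adjustment prefix.
subprocess.run(["parallel_stereo", *images, *cameras, "stereo/run",
                "-t", "rpc", "--bundle-adjust-prefix", "ba/run"],
               check=True)
```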

In that software the RPC coefficients are not changed directly; rather, rotation + translation corrections are saved separately.

(SkySat data can be a pain to deal with, as it consists of very many small images. Other vendors (Pléiades and Maxar) instead offer a few big images, but those likely cost a lot more.)

harshal306 commented 1 year ago

Hi, First of all, thanks for developing this S2P tool.

I want to understand how the bundle_adjust output helps in generating refined RPC files. As you mentioned, https://stereopipeline.readthedocs.io/en/latest/tools/bundle_adjust.html outputs rotation + translation corrections. I would like to know the names of the files in which they are saved and how to use these corrections to refine the RPC file.

Thanks in advance.

oleg-alexandrov commented 1 year ago

First, my apologies for having this discussion here, as it is unrelated to S2P. To answer briefly, see https://stereopipeline.readthedocs.io/en/latest/tools/bundle_adjust.html#format-of-adjust-files for the format of our adjustments. But they are meant to be used internally only, as they depend on what is defined to be the camera center, and that is not exposed there. I will fix that at some point.

The preferred ASP approach is to move off RPC and to a pinhole camera, which has a camera position, rotation, optical center, and focal length (and distortion, though this is unnecessary with SkySat). Those are simple numbers and easier to optimize than RPC. I would suggest reading https://stereopipeline.readthedocs.io/en/latest/examples.html#rpc-models and https://stereopipeline.readthedocs.io/en/latest/examples.html#skysat-stereo-and-video-data. (I would also recommend the very latest build, where I fixed a bug in converting from RPC to pinhole cameras.) I must say, as before, SkySat can be tricky to deal with.
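For reference, here is a minimal sketch of the pinhole projection with exactly the parameters listed above (camera position, rotation, optical center, focal length). This is the generic textbook model, not ASP's internal implementation, and distortion is omitted.

```python
# Generic pinhole projection: world point -> pixel, using camera position,
# rotation, optical center and focal length. Textbook model, not ASP code.
import numpy as np


def pinhole_project(point_world, R, camera_center, focal, optical_center):
    """R is the world-to-camera rotation (3x3); focal and optical_center are in pixels."""
    p_cam = R @ (np.asarray(point_world, dtype=float) - np.asarray(camera_center, dtype=float))
    x, y = p_cam[0] / p_cam[2], p_cam[1] / p_cam[2]   # perspective division
    col = focal * x + optical_center[0]
    row = focal * y + optical_center[1]
    return col, row
```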

Sorry again for taking over this issue page. If you have further questions you can use the mailing list at https://groups.google.com/g/ames-stereo-pipeline-support.

mnhrdt commented 1 year ago

Thanks @oleg-alexandrov, while this may be formally off-topic, your interventions are always very welcome to us!

To add to oleg's excellent answer, you may also want to look at https://github.com/centreborelli/sat-bundleadjust. This will be integrated into mainline s2p at some point. This strategy is different from that of ASP: instead of approximating the whole RPC by a pinhole camera, we compose the original RPC with a small rotation + translation of the sensor, and the optimizer only changes these corrective rotations. It would be interesting to compare the results of both approaches. I guess that for a small scene both methods will be essentially equivalent, but for a large scene (where the RPC cannot be globally well-approximated by a single pinhole model) they may give different results.
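To make the idea concrete, here is a minimal sketch of that strategy as described in this comment, not the actual sat-bundleadjust code: the original RPC projection is kept as a black box, and a small corrective transform, the only thing the optimizer updates, is composed with it. Applying the correction as a 2D rotation + translation in image space is a simplification assumed for the example; see the repository and the paper for the exact parametrization.

```python
# Sketch of "keep the RPC, optimize only a small correction" (not the actual
# sat-bundleadjust code). The RPC coefficients are never touched; only the
# per-image corrective rotation and translation are free variables.
import numpy as np


def corrected_projection(rpc_project, lon, lat, alt, theta, t):
    """Compose the original RPC projection with a 2D rotation (theta, radians)
    and a translation t = (t_col, t_row), both applied in pixel space."""
    col, row = rpc_project(lon, lat, alt)      # original, unmodified RPC
    c, s = np.cos(theta), np.sin(theta)
    col_c = c * col - s * row + t[0]
    row_c = s * col + c * row + t[1]
    return col_c, row_c
```

During bundle adjustment, (theta, t) for each image would be the unknowns, while the RPC itself stays fixed.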

oleg-alexandrov commented 1 year ago

I read your paper with great interest.

It will be nice to see your solution integrated with S2P, so that users can easily try it out.

I agree that modifying the RPC coefficients directly is undesirable. There are too many of them.

While it is true that ASP's move from RPC to pinhole cameras makes it somewhat harder to use the cameras in other tools, pinhole cameras are the simplest camera model possible: they just have a rotation, translation, optical center, and focal length, which is standard in computer vision. After bundle adjustment concludes, one can convert the cameras back to RPC. Or, S2P could support pinhole cameras; those are like RPC with fewer coefficients. :)

I agree that avoiding bundle adjustment and only later doing alignment of clouds is suboptimal. It will also fail on flatter surfaces.

I agree that using RPC vs pinhole will give similar results for small footprints.

I am a little confused by the statement:

where the RPC cannot be globally well-approximated by a single pinhole model

Your paper refits the RPC for every single camera after it is modified. To me, that looks equivalent to using a pinhole camera and simply adjusting the pinhole camera's rotation and translation. (There is a good chance I did not read the paper in enough detail.)
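For readers unfamiliar with what "refitting the RPC" involves, here is a generic sketch (not the paper's code, and without the normalization offsets/scales and coefficient ordering of real RPC files): sample a 3D grid over the footprint, project it with the corrected camera, and fit the rational-polynomial coefficients by linear least squares with the denominator's constant term fixed to 1.

```python
# Generic RPC refit sketch: fit row ~ P(x)/Q(x) (and likewise for col) to
# samples projected through a corrected camera. Normalization and the exact
# coefficient ordering of real RPC files are deliberately omitted.
import itertools
import numpy as np


def cubic_monomials(x, y, z):
    """All 20 monomials x^i * y^j * z^k with i + j + k <= 3 (constant term first)."""
    return np.array([x**i * y**j * z**k
                     for i, j, k in itertools.product(range(4), repeat=3)
                     if i + j + k <= 3])


def fit_rational(samples_3d, values):
    """Least-squares fit of values ~ P/Q with Q's constant term fixed to 1.
    Needs at least 39 well-spread samples (20 + 19 unknowns)."""
    values = np.asarray(values, dtype=float)
    M = np.array([cubic_monomials(*p) for p in samples_3d])   # (N, 20)
    A = np.hstack([M, -values[:, None] * M[:, 1:]])           # (N, 39)
    coeffs, *_ = np.linalg.lstsq(A, values, rcond=None)
    return coeffs[:20], np.concatenate([[1.0], coeffs[20:]])  # P, Q


# Usage idea: build samples_3d as a lon/lat/height grid over the footprint,
# compute values as the row (then col) coordinates projected by the corrected
# camera, and call fit_rational once per coordinate.
```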

The SkySat camera is a frame camera. The pinhole model (with lens distortion) is the native model, as it precisely models how rays travel from the ground to the sensor. I would guess the SkySat folks start with a pinhole camera and only later convert to RPC for user convenience.

Or, are you creating a single big RPC camera and fusing the little image footprints into a single image? That can have its own accuracy issues.

I agree that "bundle adjustment creates a significant number of variables to be estimated when the number of observed points is large." I did not find that to be a problem in practice. Bundle adjustment (with Google Ceres) is multithreaded, RPC (and pinhole) models are fast to evaluate, and if you have 1000 cameras with 50-400 points per camera, for example, that is still a rather small setup which will converge reliably, but may take a few hours.
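For a sense of scale, here is a back-of-the-envelope count of unknowns and residuals for the example above; the observations-per-camera and track-length figures are illustrative assumptions, not numbers from this thread.

```python
# Rough problem-size estimate for the 1000-camera example above.
# Assumptions (illustrative only): 6 pose parameters per camera, ~200
# observations per camera, and each 3D point seen in ~3 images on average.
n_cameras = 1000
obs_per_camera = 200               # midpoint of the 50-400 range
avg_track_length = 3               # assumed images per 3D point

n_observations = n_cameras * obs_per_camera        # 200,000 pixel measurements
n_points = n_observations // avg_track_length      # ~66,700 unique 3D points
n_unknowns = n_cameras * 6 + n_points * 3          # ~206,000 parameters
n_residuals = n_observations * 2                   # x and y error per measurement

print(n_unknowns, n_residuals)                     # roughly 2.1e5 vs 4.0e5
```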

As an aside, all these issues can be sidestepped by using a linescan sensor instead of many little frames. But those have different issues, such as jitter, so refining the camera trajectory samples and the sequence of orientations that compose a single linescan acquisition may still be necessary.

Happy to have these discussions. Good to learn new things.