NeoGeographyToolkit / StereoPipeline

The NASA Ames Stereo Pipeline is a suite of automated geodesy & stereogrammetry tools designed for processing planetary imagery captured from orbiting and landed robotic explorers on other planets.

cropping cameras for bundle_adjust #407

Closed · steo85it closed this issue 11 months ago

steo85it commented 11 months ago

Hi! I am running parallel_bundle_adjust on a set of LROC-NAC cameras (.cub + CSM .json) along with their map-projected versions, on a limited target region of 1x1 km. Now, loading the (~200) cameras takes a (comparatively) long time, especially for the statistics and matching steps, just because they still cover their original field of view rather than only the pixels in the target region.
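
For reference, the command is roughly of the following form (file names are placeholders, not my actual data; if I read the --mapprojected-data doc right, the DEM used for mapprojection goes last in that list):

parallel_bundle_adjust img1.cub ... imgN.cub img1.json ... imgN.json \
    --mapprojected-data 'img1.map.tif ... imgN.map.tif ref-dem.tif' \
    -o ba/run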

Would it be possible to "crop" the input camera files to the target region too, and if so, how should I proceed, knowing the max/min lon/lat of the target region? (For this specific case, matches are computed within that limited region between the map-projected images anyway.) Thanks!

steo85it commented 11 months ago

fyi @mkbarker

oleg-alexandrov commented 11 months ago

When interest points are created using the --mapprojected-data option, they are only created within the areas covered by the mapprojected images. So, you can try cropping the mapprojected images and restarting bundle adjustment (after wiping all existing matches). I believe the statistics are also computed on the mapprojected images, if you have them.
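
For example, a cropped version of a mapprojected image can be made with GDAL (the corner values below are placeholders, in the projected coordinates of that image):

gdal_translate -projwin <ulx> <uly> <lrx> <lry> img1.map.tif img1.map.crop.tif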

If you already have a lot of matches but want to keep only those in your area of interest (which is not your goal, but is related), you can use the --proj-win option in bundle_adjust (https://stereopipeline.readthedocs.io/en/latest/tools/bundle_adjust.html).
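
A rough sketch, with placeholder values for the box (see the doc page above for the exact conventions and units this option expects):

bundle_adjust <images> <cameras> --proj-win <xmin> <ymin> <xmax> <ymax> -o run/run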

Let me know if this does not address the issue.

steo85it commented 11 months ago

Thanks for the quick reply, Oleg! I see that matches are only computed within the areas seen in the mapprojected images, but my input cameras still cover the full NAC field of view (which is much larger), are still ~20 MB each (CSM .json), and still take ~30 s to load (which is done for all input images in all instances, so even just the "statistics" step of BA is taking ~15 hours for me on 12 CPUs). Cropped cameras containing only the "relevant pixels" would be ~100 KB and would load in 1-2 s. The problem is, I don't know how to crop those cameras (either the ISIS .cub or the CSM .json) to my region of interest (other than with the ISIS camtrim tool, maybe?).

oleg-alexandrov commented 11 months ago

Now I am a little confused. A camera is just a small .json file; it should not take much time to load. The statistics are computed just once, for the cropped images, and cached afterwards. The .cub files are not important if the .match files are already found.

So, sorry for asking you to repeat yourself, but I would like to understand better what you are doing. Let's do this:

Apparently you already have matches, etc. So, try running, with, say, 50-200 .cub and .json files, the command:

bundle_adjust -o run/run

assuming that the output directory already has all the data, and tell me how long it takes to load everything and where you think the bottleneck is. Do you think loading the .json files takes time?
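
Spelled out with placeholder file names, that rerun would look something like:

bundle_adjust img1.cub ... imgN.cub img1.json ... imgN.json -o run/run

With the same output prefix, the statistics and .match files already present under run/ should be picked up rather than recomputed.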

I just did my own experiment, and redoing bundle_adjust with prior match files takes a trivial amount of time, even though it does reload the .json cameras.

Also note that if you already have matches, you don't need to run parallel_bundle_adjust, as it does nothing extra for you; you can just run bundle_adjust itself, with a single process.

To answer your question though, functionality for cropping CSM cameras does not exist.

Sorry for the back-and-forth. Likely I am still not getting something obvious.

steo85it commented 11 months ago

Trying now, thanks! However, I do not have matches yet (so I am running through all "3 steps" of BA; not sure if that changes your take on the issue). (Also, for disclosure, what I have are output and log files from a previous, much faster, run by "colleagues" that I am trying to "reverse-engineer" with the same set of images/cameras and target region.)

oleg-alexandrov commented 11 months ago

OK, you can see how it goes. My best guesses (and please correct them) are: (a) the .json files load fast, and a repeated run can verify that; (b) statistics and interest point matching happen only once, and only for the cropped mapprojected data, so that time is proportional to the mapprojected image extent.

steo85it commented 11 months ago

I am starting to think that the main "issue" here is parallel_bundle_adjust loading all cameras in each instance (one per input image) just to produce statistics for a single image. When a lot of images are provided (>> number of available cores), that is probably not worth the "distribution" (the test you suggested is taking, if I extrapolate, ~1 h to load the cameras, i.e. ~25 s/camera, and probably 5 min or less to generate statistics for all images, which is consistent with your guess (b)).

I am also still unsure how the "example" that I am reverse-engineering managed to use such small camera files to bundle-adjust the same NAC images...

> To answer your question though, functionality for cropping CSM cameras does not exist.

Have they maybe cropped the ISIS cameras before converting them (although I thought that was also not possible yet)?

oleg-alexandrov commented 11 months ago

> I am starting to think that the main "issue" here is parallel_bundle_adjust loading all cameras in each instance (one per input image) just to produce statistics for a single image. When a lot of images are provided (>> number of available cores), that is probably not worth the "distribution" (the test you suggested is taking, if I extrapolate, ~1 h to load the cameras, i.e. ~25 s/camera, and probably 5 min or less to generate statistics for all images).

This is likely. We did not optimize for that, as it did not appear that loading a few hundred cameras at the same time would be a problem, even if multiple processes do it simultaneously. It would be quite some work to optimize this, as the camera loading happens in a different part of the code, which does not yet know which cameras will be needed later.

You can still run a single instance of bundle_adjust after all stats and matches are done, and see just how long it takes to simply load 200 cameras. Then, maybe one can use a smaller value for --processes, since each process uses multiple threads when finding interest point matches, which takes longer than finding stats.
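
For example (placeholder file names, and 4 is just an illustrative value):

parallel_bundle_adjust img1.cub ... imgN.cub img1.json ... imgN.json --processes 4 -o run/run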

Anyhow, I will keep this in mind when I revisit large-scale bundle adjustment with many .json files. I never found this to be a bottleneck, in the grand scheme of things, but that is maybe because such a job usually takes tens of hours and gets done overnight.

steo85it commented 11 months ago

Yes, I confirm my estimates above (1 h for loading, 5 min for the stats). I will manually run a single bundle_adjust with --stop-after-statistics (1 h << 15 h :) ), then run parallel_bundle_adjust from there to distribute the matching computations, as sketched below. That should solve this issue for these experiments, and I will later ask how they got those smaller cameras to work, as that can still be helpful in other instances. Thanks for the feedback!
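
Roughly, with placeholder file names and the same output prefix for both steps, so that the second command finds the cached statistics and only distributes the matching:

bundle_adjust img1.cub ... imgN.cub img1.json ... imgN.json --stop-after-statistics -o ba/run
parallel_bundle_adjust img1.cub ... imgN.cub img1.json ... imgN.json \
    --mapprojected-data 'img1.map.tif ... imgN.map.tif ref-dem.tif' -o ba/run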

steo85it commented 10 months ago

We finally figured out that I had been using the gen_csm.py script from the manual meant for frame cameras (8.13.1.2) instead of the one for linescan cameras (8.13.2.1). That changes the size of the output .json from ~150 KB to ~25 MB in the case of an LROC NAC image (hence my "nonsense question").

Clearly my mistake of not reading carefully, but I would suggest:

Thanks!

rbeyer commented 10 months ago

What we should really do is double-check that this functionality is reproduced by the isd_generate program distributed with ALE, and then switch the documentation to have users just run that program appropriately.
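
If I remember the ALE tool's interface correctly (worth double-checking against isd_generate --help), usage would be roughly the following, with a placeholder cube name:

isd_generate image.cal.echo.cub

which should write a CSM ISD next to the cube (image.cal.echo.json or similar).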

oleg-alexandrov commented 10 months ago

Yes, we need to test that every single example in the doc can work with the isd_generate program, and ensure that this tool can handle all CSM sensors.

For now I implemented Stefano's suggestions 1 and 2. I did not add the warning, as that would need to go in every single section, and hopefully clearer names will be enough. Also, the user would notice they did the wrong thing as soon as they did something useful with the data, like stereo or mapproject, as then the results would be total junk if a different sensor class is used.

steo85it commented 10 months ago

Thanks for the prompt update! And ok, I was not aware of that upcoming development: that sounds convenient.

> Also, the user would notice they did the wrong thing as soon as they did something useful with the data, like stereo or mapproject, as then the results would be total junk if a different sensor class is used.

I am not sure about this, though: I was indeed using the "wrong thing" (gen_csm_frame.py for LROC NAC), but my mapprojected images looked OK, as did the sfs simulated images and the bundle_adjust results (just very slow). (Not to say that one should absolutely add that warning; clearer names should be enough. I just want to point out this specific case.)