DOI-USGS / ISIS3

Integrated Software for Imagers and Spectrometers v3. ISIS3 is a digital image processing software package to manipulate imagery collected by current and past NASA and International planetary missions.
https://isis.astrogeology.usgs.gov
Other
195 stars 166 forks source link

Findfeatures - "Can't invert empty matrix" Errors #4639

Closed lwellerastro closed 1 year ago

lwellerastro commented 2 years ago

ISIS version(s) affected: 5.0.2

Description
When running on a list of overlapping images, if findfeatures has a catastrophic failure with one image in the list, the entire process fails and no network is produced. After running findfeatures on 5800+ images I have 25 images that failed with the same error:

Program = findfeatures
Class   = "PROGRAMMER ERROR"
Code    = 3
Message = "Can't invert empty matrix"
File    = GenericTransform.cpp
Line    = 139

There is nothing in the output log files that indicate which image(s) is causing the problem or what the problem actually is so it takes quite a bit of time and testing to determine what is going on. I have found in the 2 cases I scrutinized that only one image in the input fromlist fails with this error when I run images through findfeatures one at time. So one image in a long list of input images via fromlist is making the entire process fail.

I am running findfeatures on overlapping images which have overlap ratios >5%. For the two cases I looked into closely the overlap ratio is just less than 6%. For the two cases (which happen to overlap each other), the area of overlap is the top of one image and the bottom of the other. The spice is decent and there is plenty of overlap as far as I'm concerned to place points - about 300 lines and across nearly all samples.

I have many images with similar or less overlaps that work just fine so I'm not clear on what the problem is. Maybe those 5% overlaps are different in size and shape and somehow that's better? I don't know because without going through an arduous process for every failure, I simply don't know which overlap in the input fromlist is causing the problem. Findfeatures shouldn't die because it is having an issue with one input image out of many. Can't it just report on the offending image and move on?

I have well over 173,000 images to process and now expect there will be many of these types of failures. Some images may end up in other overlapping image networks because findfeatures has no issues with them, but if one other image in the fromlist is a problem, then no points for anyone. It's not clear to me how many image networks not being generated due to this problem will be ok in the end but it could have quite the impact depending on the total number of these types of all or nothing failures.

How to reproduce
See files under /work/users/lweller/Isis3Tests/FindFeatures/EmptyMatrix/

The findfeatures command can be found in TC2S2B0_01_02980N041E0466_lq13_matcher_ff.cmd (can execute this file or copy and paste onto the command line) and it expects TC2S2B0_01_02980N041E0466_lq13_fromlist_ff.lis as input.

The output files from the failed run are TC2S2B0_01_02980N041E0466_lq13_ff.log print_TC2S2B0_01_02980N041E0466_lq13_ff.prt

The same command will successfully run if the image TC2W2B0_01_05147N056E0466 is removed from the input fromlist.

Command: findfeatures algorithm=fast/brief maxthreads=7 match=/scratch/lweller/Kaguya_TC/Global/Morning/Lev0/TC2S2B0_01_02980N041E0466.lev0.cub fromlist=TC2S2B0_01_02980N041E0466_lq13_fromlist_ff.lis fastgeom=true geomtype=camera maxpoints=10000 epitolerance=3.0 ratio=0.9 hmgtolerance=3.0 filter=sobel networkid=TC2S2B0_01_02980N041E0466_lq13 pointid='TC2S2B0_01_02980N041E0466lq13?????' onet=TC2S2B0_01_02980N041E0466_lq13_ff.net tolist=TC2S2B0_01_02980N041E0466_lq13_cubes_ff.lis tonotmatched=TC2S2B0_01_02980N041E0466_lq13_notmatched_ff.lis description='Create image-image control network' debug=true debuglog=TC2S2B0_01_02980N041E0466_lq13_ff.log -log=print_TC2S2B0_01_02980N041E0466_lq13_ff.prt

Additional context
Although the error message is different, I believe this is related to https://github.com/USGS-Astrogeology/ISIS3/issues/556

KrisBecker commented 2 years ago

Hi @lwellerastro...

I am having the similar issues and am working on a fix. This particular error doesn't have the same signature/behavior I have seen but is probably related to #556.

I assume you are using the FASTGEOM option? If not, then there is likely some other explanation than what I am about to describe.

The problem occurs when constructing a FastGeom transformation matrix between the geometry of the MATCH/query image and one of the FROMLIST/trainer images. When no valid lat/lon correspondence is found, it aborts. However, it could find a really bad (e.g., a collinear point) matrix and fail in a manner caused by a bad transform like you are reporting. You can try to increase the FASTGEOMPOINTS significantly (default=25) and it may help some.

One way to confirm this is to turn off FASTGEOM. You should then switch to a scale and rotation invariant detector such as SIFT or SURF. If it still fails with the same error (no guarantees here), then it's likely some other problem.

My fix is intended to allow any/all images to attempt to be matched - even those without overlap - and not fail. Images with non-corresponding geometries will be detected and eliminated entirely and added to the TONOTMATCHED output file list. I would not recommend just throwing all images in your set at each match attempt as that would significantly increase processing time. You still need to screen for overlaps (which is also mixed bag).

Due to this error, I will ensure the matrix can be inverted as well.

lwellerastro commented 2 years ago

Thanks for the feedback @KrisBecker. I'll edit the main post to include my command line for all to see. As you correctly guessed, I am setting fastgeom=true. I will try some your suggestions and see if anything changes for a couple of my failures (some of which you have suggested via other posts... sorry this sort of trouble shooting hasn't sunk in yet).

lwellerastro commented 2 years ago

@KrisBecker, thanks again for the suggestions. Running a set of problem images through findfeatures setting fastgeom=false (sift/sift) results in No control points found. However when I run setting fastgeom=true fastgeompoints=50 I get a ton of really good points.

I will rerun my failures using that combo and see how that goes, though I don't actually know what fastgeompoints ought to be, so I'll just go with 50. Since I have lots of images to process is it ok to use fastgeompoints=50 on all if necessary, or does it make better sense to go that direction only for failures (which I have yet to determine if this fixes all of my current failures)? I'd have a look at the documentation to better understand the parameter, but I can't get to our site today for some reason.

Even if this works, I still think there is any issue with that fact that the program fails across the board even though it truly only has a problem with one image in the input list. @jessemapel added a new post ( https://github.com/USGS-Astrogeology/ISIS3/issues/561) based on what I reported in #556, but I'm not sure it's the same idea/fix that you explained above.

KrisBecker commented 2 years ago

Using 50 points should be OK. This parameter is essentially used to construct a grid of potential lat/lon mappings between the images. More may be better in most cases but there is a point of diminishing returns when you get a lot of points.

Note these points are used to construct a homography matrix that is applied as a 2D linear perspective warp of trainer images in OpenCV. You can see the implementation in ImageSource class. The minpts parameters is FASTGEOMPOINTS.

lwellerastro commented 2 years ago

Thanks much @KrisBecker. I ran all 25 failures I had through findfeatures using 50 points and they all had a network generated, so this work around really helped. I think I'll have a better chance of remembering this option in the future since I feel I will need to use it quite a bit with the current dataset.

I'm going to see if I can add a test to my cluster submission script for default fastgeompoints failures, and try a rerun using fastgeompoints=50. Since it's a pretty hard/definitive failure I think the return code will be testable, then I can rerun the program and simply point to a second config file with the alternate number of points.

KrisBecker commented 2 years ago

Absolutely!

This is great info. I will be testing this option on some datasets (i.e., comets, asteroids) I have had matching problems with as well. Here are some additional suggestions/settings I have found to be effective in increasing the efficacy of these types of networks:

  1. Increase of RATIO (>= 0.9) will increase the number of identified features, but may increase false positives, which is common in feature matching. pointreg will help cull the worst of them with appropriate parameters/options. This is the Lowe's ratio test and is typically the cause of most points being identified as an outlier.
  2. Increase HMGTOLERANCE and EPITOLERANCE in odd increments. This will expand the outlier tolerance in pixels and retain more points. It will have implications for pointreg as you should also increase your search chip size accordingly.
  3. For some of the most difficult matching, consider using SOBEL or SCHARR filtering in findfeatures.
  4. Use MAXPOINTS effectively. You can also specify this option in the matching algorithm specification as the number of retained feature keypoints is very much impacted by the algorithms used. Hence, behavior can be very algorithm specific. OpenCV algorithms sort the keypoints based upon the detector response value and return the best MAXPOINTS features.
  5. Its probably time to mention that you almost always should refine your findfeature points to subpixel accuracy using area based matching via pointreg. Feature matching is typically accurate to the whole pixel. Using FASTGEOM may result in registrations that are slightly better than whole pixel, which is usually not good enough for least squares bundle adjustments for the purposes of high precision geometry. I have standardized on Tolerance = 0.9 and ChipInterpolator = BiLinearType in the regdef file. Since findfeatures usually creates dense networks, this usually works out well.
  6. Specify appropriate ValidMinimum and ValidPrecent in the redgef file. You will save yourself lots of grief later on as this will help eliminate points in shadowed regions and points close to limbs (findfeatures is really good at finding limb points). It's important to use both these options together. Valid min screens out shadows and off limb data. Use rather high percentages to remove points off the limb. A percentage of 50% won't help you much - 90% or higher provides decent point retention but is dependent upon chip size. I have found this will save you lots of cnetedits removing outliers with high point/measure residuals and iterations of jigsaw because a point will move off the target body and result in crashes.
  7. Apply pointreg after combining all the image-based feature networks using cnetcombinept. Select a good IMAGETOL that coincides with HMGTOLERANCE and EPITOLERANCE values (e.g., max(HMGTOLERANCE,EPITOLERANCE)+0.5). The size (lines/samples) of the search chip should be at least larger than the size of the pattern chip by 2*(IMAGETOL+1).
  8. Before combining image-based feature networks in cnetcombinept, sort the network list (CNETLIST) using an appropriate algorithm consistent with your final image mosaic order. This is critically important in the selection of the reference image and sub pixel refinement in pointreg. I use the MESSENGER MDIS order algorithm with good results.
  9. After pointreg, run cnetcombinept again on the refined network with IMAGETOL=0 and MINMEASURES=3 to eliminate 2-ray points. jigsaw doesn't mind 2-ray points, but you can really compromise your solution, particularly if you have a lot of 2-rays, because it lacks stereo.
  10. Be not afraid to use cnetthinner (Tammy swears by/loves this app). Be sure to use appropriate WEIGHT options to prefer retention of deeper (measures per) points. The default (WEIGHT=0) will simply use the best average correlation coefficient from pointreg or OpenCV algorithms. The product+1 of any positive, non-zero value of WEIGHT and the log of the number of measures/point is then multiplied by the mean of measure correlation coefficients with the objective to increase retention of points with the most measures. I really like dense networks particularly for evaluations against existing shape models to evaluate ground control when solving for radius and creation of (typically low resolution) shape models.
  11. And finally, increasing FASTGEOMPOINTS for problematic images may lead to more successes and less crashes when using the FASTGEOM option.

I notice you are already using many of these suggestions. This exchange has added additional knowledge to this process.

Thank you!

lwellerastro commented 2 years ago

Ooh, so many goodies @KrisBecker ! I'll have to dig into some of these and see how they can help my workflow.

The one thing I have decided to make standard is to run cnetcombinept and pointreg immediately on each and every output findfeatures network to start merging/culling the sometimes overly redundant points. If the data are awesome (fairly good spice and semi-straightforward to register), I might use imagetol=15 (doing so now for Kaguya TC), but I had to use something much smaller for noisy Titan and I could only pixel register (so very noisy and like no features). These are incredibly fast on the cluster and facilitate cnetcombining all image networks after the fact (using a smaller tolerance, like 1/2 the size of the individual nets). Registering the combined points immediately helps when all images/nets are combined because well, those measures are done and don't have to be dealt with again, and at the all image/net combine registration only needs to occur for measures=candidates so there's less to do there. Plus I found with Titan, when I combined at the image net level, then did it again for all images, then registered, some points really crept too far away from the reference making it even harder to register. It's less of a problem for better data, but I found combinept/pointreg at the image net level really helped overall, so I'm using it for Kaguya.

I realize a tolerance of 15 might seem large, but I can register offsets that large for Kaguya no problem and at the image network level it's fast. I also found using a larger tolerance was essential for Galileo Europa in getting deeper stacks as well, but I think I might have used something <10 for that data since registration was dicier due to the resolution differences.

And yes cnetthinner is a big part of the flow, sometimes more than once depending on the images and overall size of the dataset (or if I screw up and forget I've already thinned). I have not had much success with the weight options, but I was trying with Titan, which is not the best dataset for a number of things, though it (Titan) did get me on the cnetcombinept/pointreg each findfeatures output network path due to the large number of images and ridiculous amount of overlaps (easily hundreds of images, often thousands of overlaps for a single image -ewww). Right, it was the abysmally low correlations for Titan that made using weights not a good fit (I think...can't remember since that was a few months ago).

KaguyaTC's success is based upon my leap of faith with Sobel (Scharr was good too, but sobel seemed to have better results). I tested everything on Schrodinger near the first pole then applied it to Malapert which is the pole and was amazed by the results. It would not have been as successful without sobel. It made all the difference.

Use MAXPOINTS effectively Oh yeah, very necessary for KaguyaTC. Findfeatures using fast/brief will find a gazillion points without any cap. And sometimes the program fails with a cryptic message if I don't use a maxpoints. That being said, I found I had to be careful with what I chose because it seemed like if I went from say 50k to 30k, those 20k points all disappeared in the same area. At least it seemed that way. Often my tests would be between just two images, so I had to ultimately run the test using all the overlaps to see that in the end using a lower maxpoints was ok. And to remind myself that all those B images that were matched were going to be the A image against my current A image and I'd likely get points that way as well. It works out somehow. I'll have to think about your last sentence for this bullet - it probably has something to do with these areas that suddenly go blank when I reduce the number of maxpoints.

findfeatures is really good at finding limb points Haha! Not so funny with Titan though! Of course transitioning from limb to space is not a hard cutoff and space is rarely NULL right of the body. Titan was so annoying (still is - I'm not quite done) I ended up running phocube on all the images to get rid of the distracting off body pixels and ran that through findfeatures. I also had the luxury (thankfully) of using intermediate updated pointing from an outside collaborator. I think it would have been less successful without that head start. Thank Tammy for all the awesome ground points - We're (Bonnie and myself) only having to add to the poles and maybe a few more in equi regions. Not an easy thing tying to RADAR!

Someone ought to add all you suggestions to the findfeatures documentation page! I'll probably print out the email so I have them handy since there are a few things I haven't tried yet. Thanks again for sharing!

github-actions[bot] commented 2 years ago

Thank you for your contribution!

Unfortunately, this issue hasn't received much attention lately, so it is labeled as 'stale.'

If no additional action is taken, this issue will be automatically closed in 180 days.

lwellerastro commented 2 years ago

still in the works, I believe

jessemapel commented 2 years ago

@lwellerastro Yes, we're just waiting for Kris to have time to work on this.

github-actions[bot] commented 1 year ago

Thank you for your contribution!

Unfortunately, this issue hasn't received much attention lately, so it is labeled as 'stale.'

If no additional action is taken, this issue will be automatically closed in 180 days.

lwellerastro commented 1 year ago

still active

lwellerastro commented 9 months ago

Hi - Just wanted to document that recent changes to findfeatures available via isis8.1.0-RC1 have fixed this issue. I simply reran my command in the original post (and added the new parameter tonogeom to see if anything was caught there) and it generated a successful network. Whatever improvements were made to deal with geometry managed to keep the offending image included and in the network. Success all around. Thanks so much for everyone's effort in getting findfeatures improvements out there, they make a difference!

KrisBecker commented 9 months ago

@lwellerastro thanks for info.

I think this particular issue was resolved by improvements in the FastGeom image-to-image mapping algorithm (Radial). If the matrix inversion failed there would be a file in the TONOGEOM file. This could be confirmed by saving the FastGeom'ed images (see the SaveRenderedImages keyword). If an image exists for that pair, the image-to-image transform matrix inversion succeeded.

Hopefully the network is improved.