I recently got annoyed that HSM sometimes fails on what seem to be very reasonable images. E.g. HSM_fail_86_576_3904_1.8.fits:
The other images I added to the test directory are similar. These were DES Y6 Piff models that were failing HSM for seemingly no good reason. So I decided to dig in and investigate.
I discovered that the problem was with the parameter max_moment_nsig2, which was set to 25.0 with the following comment in the docs:
> A parameter for optimizing calculations of adaptive moments by cutting off profiles. This parameter is used to decide how many sigma^2 into the Gaussian adaptive moment to extend the moment calculation, with the weight being defined as 0 beyond this point. i.e., if max_moment_nsig2 is set to 25, then the Gaussian is extended to (r^2/sigma^2)=25, with proper accounting for elliptical geometry. If this parameter is set to some very large number, then the weight is never set to zero and the exponential function is always called. Note: GalSim script devel/modules/test_mom_timing.py was used to choose a value of 25 as being optimal, in that for the cases that were tested, the speedups were typically factors of several, but the results of moments and shear estimation were changed by <10^-5. Not all possible cases were checked, and so for use of this code for unusual cases, we recommend that users check that this value does not affect accuracy, and/or set it to some large value to completely disable this optimization. [default: 25.0]
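To make the cutoff concrete, here is a minimal sketch of the truncated Gaussian weight the docs describe. This is illustrative Python, not GalSim's C++ implementation; the function name and argument names are my own, with the elliptical radius rho2 = r^T M^-1 r computed from the inverse moment matrix.

```python
import math

# Hypothetical sketch of the max_moment_nsig2 cutoff (not GalSim's actual code).
# The elliptical radius-squared is rho2 = Minv_xx*x^2 + 2*Minv_xy*x*y + Minv_yy*y^2,
# and the adaptive Gaussian weight exp(-rho2/2) is set to exactly 0 past the cut.
def gaussian_weight(x, y, Minv_xx, Minv_xy, Minv_yy, max_moment_nsig2=25.0):
    rho2 = Minv_xx * x**2 + 2.0 * Minv_xy * x * y + Minv_yy * y**2
    if rho2 > max_moment_nsig2:
        return 0.0  # profile cut off: this pixel is skipped entirely
    return math.exp(-0.5 * rho2)
```

The speedup comes from skipping the exponential for pixels beyond the cut; the cost, as described below, is a discrete jump in the moments when a pixel crosses the boundary.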
The problem was that this parameter was tuned to cause an error of less than 1.e-5 in the moments. But the (default) convergence threshold is 1.e-6. So what was happening was that sometimes the iterations would move dx or dy slightly, causing a pixel to get excluded by the max_moment_nsig2 cut even though that pixel had a weight of around 3.e-6. This would discretely change the moments by about that much, which would cause (Cxx/A - Mxx/2) (or one of the other convergence quantities) to flip from positive to negative with amplitude just larger than 1.e-6. Then the next iteration would include that pixel again, leading to an endless repeating cycle until it hit the limit on n_iter.
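The mismatch between the two scales is easy to verify numerically: a pixel sitting right at the nsig2 = 25 boundary has weight exp(-25/2), which is several times larger than the 1.e-6 convergence threshold.

```python
import math

# Weight of a pixel exactly at the nsig2 = 25 cutoff boundary.
boundary_weight = math.exp(-25.0 / 2.0)  # ~3.7e-6, the "around 3.e-6" above

# This exceeds the default convergence threshold, so dropping or re-including
# that single pixel can perturb the moments by more than the convergence
# criterion tolerates -- hence the oscillation.
print(boundary_weight > 1e-6)  # True
```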
So my solution here is to get rid of this value as a separate settable parameter, and instead compute the cutoff from the convergence threshold, placing it where the Gaussian weights drop below 0.1 times that threshold. That way, including or excluding a particular pixel at that radius shouldn't change the moments by more than the threshold value. For all the particular images I saved, which had been failing, this fixed the problem. (I only included a subset of my full sample in the test suite, but I tested it myself on around 30 such images.)
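The arithmetic behind the new cutoff can be sketched as follows (a one-line illustration of the idea, not the actual internal code; the function name is mine): solving exp(-nsig2/2) = 0.1 * threshold for nsig2 gives the cut radius.

```python
import math

# Illustrative derivation of the cutoff from the convergence threshold:
# choose nsig2 so that any excluded pixel has weight below 0.1 * threshold,
# i.e. exp(-nsig2/2) = 0.1 * threshold  =>  nsig2 = -2 * ln(0.1 * threshold).
def nsig2_from_threshold(convergence_threshold):
    return -2.0 * math.log(0.1 * convergence_threshold)

print(nsig2_from_threshold(1e-6))  # ~32.2, somewhat larger than the old fixed 25
```

With the default threshold of 1.e-6 this gives a slightly larger cut radius than the old hard-coded 25, so the truncation error is guaranteed to sit below the convergence scale rather than above it.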
Happily, this also seems to have fixed the problem that @aaronroodman reported in issue #1132. So we can mark that issue as fixed now, rather than considering it a lost cause.
Finally, I also added a better error message for when the object is really too small for HSM to find a solution and the moment matrix collapses to a point or a line.
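For reference, the degenerate case is easy to characterize: the 2x2 moment matrix describes an ellipse only while it stays positive definite. A minimal sketch of that check (illustrative only; not the actual error-handling code):

```python
# Illustrative check (not GalSim's internals): the moment matrix
# M = [[Mxx, Mxy], [Mxy, Myy]] has collapsed to a line or a point
# when it is no longer positive definite.
def moment_matrix_degenerate(Mxx, Mxy, Myy):
    return Mxx <= 0 or Myy <= 0 or (Mxx * Myy - Mxy**2) <= 0

print(moment_matrix_degenerate(1.0, 1.0, 1.0))  # True: det == 0, a line
```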