Aligning two xy arrays, running into Max iterations exceeded for even the simplest cases

emirkmo commented 3 years ago

Can I align two xy arrays of stars using aa.find_transform? Because when I try following the example, it always gives the error:

astroalign.MaxIterError: Max iterations exceeded while trying to find acceptable transformation.

As it runs out of iterations during RANSAC. Perhaps I am doing something wrong and this is not intended functionality? In that case, how could I go about changing astroalign to accept 2 xy arrays instead of an xy array and an image?

For the simplest demonstration, I'll use the example notebook:

h, w = img_shape = (200, 200)
n_stars = 10
pos_x = np.random.randint(10, w - 10, n_stars)
pos_y = np.random.randint(10, h - 10, n_stars)
fluxes = 200.0 + np.random.rand(n_stars) * 300.0
pos_x _n= pos_x + 5.
img_posxy = np.array([[x, y] for x, y in zip(pos_x, pos_y)], dtype="float64")
img_posxy_new = np.array([[x, y] for x, y in zip(pos_x, pos_y)], dtype="float64")

aa.find_transform(img_posxy,img_posxy_new)

Always gives the MaxIterError, no matter if I reverse them, use different inputs, change the parameters in find_transform, etc. Is it possible to either expose the max_iter as a keyword instead of getting it from the n_invariants to increase this, if this would solve it, or to suggest a workaround?

martinberoiz commented 3 years ago

Hi Emir,

It should work with two xy arrays. I'll take a look at your example this weekend.

emirkmo commented 3 years ago

Hi Martin,

Sorry, but I got the simple example working now. I don't know what the issue was. The reason I went down to this simple version was that I was getting a lot of the same error when trying this on an actual image.

I pre-extracted the stars using sep, then projected the catalog stars onto the image pixel coordinates using the images WCS and Astropy.

I was hoping that by using astroalign, I could match the extracted stars with the catalog stars as basically a way to "check" the WCS of the image, or even correct it. If the asterisms between the catalog and extracted stars can be matched, I can re-calculate a correct WCS in cases where the WCS is slightly off.

The problem seems to come from 3 sources:

The automatic source extraction is liable to make some mistakes. Technically _ransac should deal with this but I am running into the MaxIterError whenever there are more than just a few.
Some of the actual reference stars may not exist in the image due to image quality issues and I don't know this a priori. We do make an effort to only use references that would be expected to be found in the image based on the instrument, filter, and exposure times. So this is a minor issue. Especially when both xy lists have false positives, the RANSAC algorithm in astroalign really doesn't seem to like it.
The WCS projection can break some of the asterisms I guess if it's very wrong? However, I was only trying to use astroalign to correct minor astrometry problems, so I don't think this is the issue.

So I do think there could be a bug. I went through aa.find_transform line by line with 2 arrays to see if I could find any obvious problems with my data, and I noticed that:

min_matches = max(1, min(10, int(n_invariants * MIN_MATCHES_FRACTION)))

Will always give 10 for any array of x,y I have tried, as n_invariants for my arrays of stars (usually about 20-80 in an image), is huge, anywhere between hundreds to hundreds of thousands, depending on whether NUM_NEAREST_NEIGHBORS has been increased (which I tried to see if this would help). But even with this, _ransac seems to fail after reaching maxiter.

Basically, I am only getting a success when I manually make sure that the top 3 stars in source and target are the same, and have very similar x and y coordinates, and set max_control_points to 3, otherwise I almost always have a failure with the above error.

I'll put up a minimal example if I can't find the issue myself.

martinberoiz commented 3 years ago

Hi Emir,

If the error repeats too often when there's clearly a transformation to be found, then I'd say it's a bug.

Astroalign has a few "empirical" parameters to decide a few criteria of the algorithm. This line you refer to:

min_matches = max(1, min(10, int(n_invariants * MIN_MATCHES_FRACTION)))

basically says that despite how many triangles are there, if the proposed transformation fits 10 of them, then the transformation is accepted and returned. If the number of triangles happens to be less than 10, then it will accept it when 80% of them are matched. Those numbers are rather arbitrary, but I figured that matching 10 triangles by pure chance should be very unlikely, so I capped it at 10. In finding the transformation, it's best to deal with the brightest few sources in the field to avoid unnecessarily confusing ransac.

If you send me some data and script that show a problem case I'll take a look. You can send it to my gmail. I'm interested in cases where astroalign falls short.

emirkmo commented 3 years ago

Thanks! I have put up a notebook with a minimal working example that includes two data arrays on my GitHub:

https://github.com/emirkmo/astroalign_example

I experimented further and I do actually manage to find the transform some of the time, but it does seem to "randomly" fail as well most of the time. The details are in the notebook. Let me know if you want more information or an example that fails more consistently. I chose a very "easy" one for now that I expected to work.

Edit: Basically, I need to significantly increase the pixel tolerance and NUM_NEAREST_NEIGHBORS, then I am sometimes able to find a transform in this specific example. Looking at the source code, limiting max_control_points to a low number is just a hack for these specific arrays since the first 4 elements are pre-matched.

I post-cleaned these extractions from sep by forcing a new 2D gaussian fit using astropy, and removing those that were not a gaussian star based on some goodness of fit measure. So there are much fewer false positives. I still don't understand why even in this nice example with relatively few false positives, it's difficult to get a match. And if there are a lot more false positives, I can basically never get a match no matter how much I relax the parameters due to Ransac reaching iteration limit.

martinberoiz commented 3 years ago

Hi, I got your notebook and I could reproduce all of the errors mentioned. I'll go deeper into this subject and come back as soon as I can.

martinberoiz commented 3 years ago

Sorry for the long delay. I did some analysis and found that indeed there aren't that many matches in the invariant space. By eye, I counted around 7, below the minimum of 10 to call success.

I think this may have to do with the several stars without companion that cause a huge amount of "noise".

I will do some more tests and try to find if there's a working subset that works.

I'll keep you posted.

martinberoiz commented 3 years ago

Ok, it seems that too many unmatched sources spoils the transformation. Check the PR that I just created for your repo.

emirkmo commented 3 years ago

Thanks. Makes sense. I couldn't limit the amount of unmatched sources for my use case. Would the best practice for using astroalign be pruning of possible unmatched sources?

Now I also understand why astroalign usually works for me when I align many of the same image, as there are very few unmatched sources.

martinberoiz commented 3 years ago

Yes, I guess that's a weakness of the algorithm. It doesn't take too kindly too many outliers.

quatrope / astroalign

Aligning two xy arrays, running into Max iterations exceeded for even the simplest cases #63