smroid / cedar-solve

A fast lost-in-space plate solver for star trackers.
https://tetra3.readthedocs.io/en/latest/
Apache License 2.0

Improved T_solve Performance across RA/Dec Space #5

Closed iain-clark closed 2 weeks ago

iain-clark commented 3 months ago

This is in response to a query from Steven regarding my issue on Tetra3: https://github.com/esa/tetra3/issues/30

I had questioned the solution times (T_solve) I was seeing. Steven said he'd like to see a similar analysis on this cedar-solve fork of Tetra3. The code is attached. It walks across RA/Dec space, generates centroids for 28.6-degree-square patches of sky, and passes them to the solve_from_centroids function. I did this using both a new database built from this fork and the Tetra3 default_database from back in November, and I also varied the number of centroids used by the solver. It appears most or all of the gains come from the improved solver: the results with the new solver are about 8-10x better in terms of T_solve than with Tetra3. The old database started to show a few (8) solution failures at pattern_checking_stars=10, whereas the new database generated with this fork had none at 10. However, the old database had only 600 failures with 6 pattern stars, as opposed to 1500 with the new database.
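In outline, the sweep in the attached code does something like the following (a sketch with illustrative names; `make_centroids` stands in for my centroid-generation helper and is not part of the library):

```python
import time
import numpy as np

def sweep(solver, make_centroids, fov_deg=28.6, step_deg=10.0):
    """Walk RA/Dec space, timing solve_from_centroids() for each sky patch.

    Returns a list of (ra, dec, solved, t_solve_ms) tuples.
    """
    results = []
    for dec in np.arange(-80.0, 80.0 + 1e-9, step_deg):
        for ra in np.arange(0.0, 360.0, step_deg):
            # Synthesize centroids for a patch centered at (ra, dec).
            centroids = make_centroids(ra, dec, fov_deg)
            t0 = time.perf_counter()
            sol = solver.solve_from_centroids(centroids, size=(1024, 1024),
                                              fov_estimate=fov_deg)
            t_solve_ms = (time.perf_counter() - t0) * 1000.0
            results.append((ra, dec, sol.get('RA') is not None, t_solve_ms))
    return results
```

The real script also varies the number of centroids supplied and records failure counts per patch.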

With the new solver, the old database solved about 4x faster than the new one.

The overall solution speed improves with fewer pattern stars, even as the number of solution failures goes up. This is true of both the mean and minimum solution times. The old database shows greater improvement in the worst-case T_solve as pattern_checking_stars is reduced. It seems the new database has many more potential patterns to check even though it is 80% smaller.

If 85-95% of images solve with just 6 pattern stars, but we have to submit, say, 10 pattern stars to get a certain solution, does this imply that the order in which patterns from the supplied centroids are tested could be improved? Should the patterns that would be tested when only 6 centroids are provided always be tested first, followed in sequence by the incremental patterns for 7, 8, 9, 10, and 11 centroids?

With the original Tetra3 solver I was seeing a pathologically long T_solve at index i=3384 which was repeatable across various database versions. This no longer seems to be the worst case index.

Neither database yielded solutions with patterns generated from only 5 pattern stars. Does this imply these patterns should be excluded, or is this a problem with database extraction?

Given the results showing areas of RA/Dec space with long solution times, could additional patterns be generated specifically for them?

It might be instructive to do some statistical analysis of the pattern indices if they were reported by the solver.

How is the uniformity of solution density in the database assessed? I assume this is an important metric. What other metrics are of concern?
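To make the question concrete, here is one possible uniformity metric (just a sketch of the kind of measure I mean, not anything in the library): bin the pattern-centroid unit vectors into equal-area declination bands and look at the spread of counts per band.

```python
import numpy as np

def coverage_cv(unit_vectors, n_bands=20):
    """Coefficient of variation of pattern counts across equal-area Dec bands.

    unit_vectors: (N, 3) array of pattern-centroid unit vectors.
    A low value suggests uniform sky coverage; a high value suggests gaps.
    """
    z = np.asarray(unit_vectors)[:, 2]      # z component = sin(Dec)
    # Bands uniform in sin(Dec) have equal area on the sphere.
    edges = np.linspace(-1.0, 1.0, n_bands + 1)
    counts, _ = np.histogram(z, bins=edges)
    return counts.std() / counts.mean()
```

A full treatment would use 2-D equal-area cells (e.g. a HEALPix-style grid), but even the 1-D version flags gross non-uniformity.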

See code: exercise_tetra3_cedar.tgz

Thanks Iain

iain-clark commented 3 months ago

I think this might be a better order in which to select centroid patterns:

```python
>>> import numpy as np
>>> from itertools import combinations
>>> n = 8; r = 4
>>> # use the built-in sorted, not np.sort
>>> pat_list = np.fliplr(sorted(np.fliplr(list(combinations(list(range(n)), r))).tolist())).tolist()
>>> pat_list
[[0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 3, 4], [0, 2, 3, 4], [1, 2, 3, 4],
 [0, 1, 2, 5], [0, 1, 3, 5], [0, 2, 3, 5], [1, 2, 3, 5], [0, 1, 4, 5],
 [0, 2, 4, 5], [1, 2, 4, 5], [0, 3, 4, 5], [1, 3, 4, 5], [2, 3, 4, 5],
 [0, 1, 2, 6], [0, 1, 3, 6], [0, 2, 3, 6], [1, 2, 3, 6], [0, 1, 4, 6],
 [0, 2, 4, 6], [1, 2, 4, 6], [0, 3, 4, 6], [1, 3, 4, 6], [2, 3, 4, 6],
 [0, 1, 5, 6], [0, 2, 5, 6], [1, 2, 5, 6], [0, 3, 5, 6], [1, 3, 5, 6],
 [2, 3, 5, 6], [0, 4, 5, 6], [1, 4, 5, 6], [2, 4, 5, 6], [3, 4, 5, 6],
 [0, 1, 2, 7], [0, 1, 3, 7], [0, 2, 3, 7], [1, 2, 3, 7], [0, 1, 4, 7],
 [0, 2, 4, 7], [1, 2, 4, 7], [0, 3, 4, 7], [1, 3, 4, 7], [2, 3, 4, 7],
 [0, 1, 5, 7], [0, 2, 5, 7], [1, 2, 5, 7], [0, 3, 5, 7], [1, 3, 5, 7],
 [2, 3, 5, 7], [0, 4, 5, 7], [1, 4, 5, 7], [2, 4, 5, 7], [3, 4, 5, 7],
 [0, 1, 6, 7], [0, 2, 6, 7], [1, 2, 6, 7], [0, 3, 6, 7], [1, 3, 6, 7],
 [2, 3, 6, 7], [0, 4, 6, 7], [1, 4, 6, 7], [2, 4, 6, 7], [3, 4, 6, 7],
 [0, 5, 6, 7], [1, 5, 6, 7], [2, 5, 6, 7], [3, 5, 6, 7], [4, 5, 6, 7]]
```

I think this means all patterns with max indices <= n come before all patterns with max indices < n.

```python
>>> max = r - 2
>>> for i in range(len(pat_list)):
...     new_max = int(np.max(pat_list[i]))
...     if new_max > max + 1:
...         print('whoops at', i)
...     if new_max > max:
...         max = new_max
...         print('new max at', i)
...
new max at 0
new max at 1
new max at 5
new max at 15
new max at 35
>>> bad_list = [[4, 0, 0, 0]] + pat_list + [[21, 0, 0, 0]]
```

```python
>>> max = r - 2
>>> for i in range(len(bad_list)):
...     new_max = int(np.max(bad_list[i]))
...     if new_max > max + 1:
...         print('whoops at', i)
...     if new_max > max:
...         max = new_max
...         print('new max at', i)
...
whoops at 0
new max at 0
new max at 6
new max at 16
new max at 36
whoops at 71
new max at 71
```

iain-clark commented 3 months ago

correction: I think this means all patterns with max indices <= n come before all patterns with max indices > n.

smroid commented 3 months ago

Iain,

Thanks for evaluating cedar-solve!

A few notes:

  1. Although there are four combinations (old vs. new solver crossed with old vs. new database), I'd like to concentrate on old solver + old database vs. new solver + new database.
  2. In the new solver, the pattern_checking_stars parameter is no longer meaningful.
  3. Your suggested pattern list enumeration code produces results identical to breadth_first_combinations.py, used by the new solver. This is indeed one of the key improvements of the new solver.

Your results where you supply very short star_centroids lists to solve_from_centroids() (e.g. just six centroids) are interesting but not unexpected. In some cases we cannot utilize all of the solve-time centroids, because we disallow stars that are too closely spaced, and can thus fail to solve.

In practice, Tetra3 (and its cedar-solve fork) is intended to work with at least ~10 centroids passed to the solver; ~15 or more is better. In real-life scenarios there will be some false detections, some missed detections, and a degree of mis-ranking by brightness due to variability and to differences in camera spectral response compared to the catalog.

iain-clark commented 2 months ago

I was able to improve tetra3_cedar T_solve times in a couple of ways.

I had been concerned about the sky coverage generate_database was producing, because I was seeing instances where 134 patterns had to be tested to get a solution. There were also many locations that would not solve, and performance was inconsistent across RA/Dec space and the FOV spread. I modified generate_database to write out the catalogue pattern list and their centroids, and plotting the centroids supported my impression of a coverage issue: I could see many largish areas with no pattern centroids. Where areas had very slow T_solve I added patterns to the database. I modified generate_database to read in the pattern list with the additional patterns I had generated and saw some improvement in the results.

I built my own library in a very slow and inefficient way but it was exhaustive and it gives me pretty good results. Plotting pattern centroids from my new database shows more uniform coverage. This was the major improvement in T_solve performance.

I didn't see an easy way to reproduce the original cluster buster with my brute-force method, so I simply deleted patterns where the shortest pattern edge was less than 1/16th the length of the longest edge. This makes the cluster definition independent of FOV, if we accept that patterns for a particular FOV must have size limits scaled to that FOV. I figured 1/8th of the FOV was a good lower limit for pattern size, so 1/16th seemed a reasonable cluster-buster limit; I think 1/2 the FOV is a good upper limit for pattern size. The 1/16th limit removed about 5% of the patterns, so it is probably worthwhile. I implemented the same 1/16th cluster-buster test in solve_from_centroids, which worked well and gave slightly better results than no cluster buster at all. The results with the original cluster buster in the solver were sometimes better at lower FOVs, but not by much, and were often slightly worse at higher FOVs. When I deleted the centroid KDTree used for the original solver's cluster buster, the min and mean T_solve were reduced by about 15%, which I think recommends this cluster-buster approach. This was the other improvement in T_solve performance.
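In sketch form, the 1/16th edge-ratio test amounts to the following (my reading of the approach, with illustrative names; not the exact code used):

```python
import numpy as np
from itertools import combinations

def passes_edge_ratio(centroids, min_ratio=1.0 / 16.0):
    """Keep a pattern only if its shortest edge is at least min_ratio
    times its longest edge.

    centroids: (4, 2) array of star positions for one pattern.
    A 4-star pattern has 6 pairwise edges.
    """
    pts = np.asarray(centroids, dtype=float)
    edges = [np.linalg.norm(pts[i] - pts[j])
             for i, j in combinations(range(len(pts)), 2)]
    return min(edges) >= min_ratio * max(edges)
```

Because the test only depends on the ratio of edge lengths, the generator and the solver can apply it identically, with no KDTree and no dependence on the particular stars in a field.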

The new database has about 1120000 patterns. I checked for very short patterns in the pattern list but only found a few and left them in.

I extracted patterns from FOVs from 4 to 16 degrees. I saw areas with fewer than 6 centroids at low FOVs, say 5 and 6. I noticed that the solver would fail with only 4 or 5 centroids, but then solve fine with the same first 4 centroids when 6 centroids were provided. This was due to the probability test, which fails with a centroid count lower than 6. Maybe there should be a switch to force a solution in those cases, or maybe some code to give up with fewer than 6 centroids. Proposed code change (early exit near the top of solve_from_centroids):

```python
if num_centroids < 6:  # or alternatively: if num_centroids < p_size
    return {'RA': None, 'Dec': None, 'Roll': None, 'FOV': None,
            'distortion': None, ..., 'status': TOO_FEW}
```

I was able to get reliable operation for FOVs from 8 to 30 degrees inclusive. I was using Hipparcos main, so I infer that it may not give enough coverage for reliable operation below FOV = 8. My generate_database process is very slow at low FOV, but if I get it sped up I'll try again below 4 degrees. I think I want to get down to 1/8th of the database's lower FOV limit to get enough solution diversity to ensure resistance to image corruption.

Given that my current database works down to FOV = 8 it could maybe be reduced if FOV = 10 is really what's required but I was happy to know I had some margin.

The new database nearly always solves with only 6 centroids. I think this implies the coverage is improved and adequate. The pattern generation process I used did not replace the cluster-buster patterns that were removed, so that should probably be fixed.

I did much of this work using my newseq sequence, but checked against the original tetra_cedar sequence and did not see much difference with the new database, whereas I had previously.

I noticed my solve times could be driven down by repeating each solve. At first I thought this was just due to the edge-ratio cache, but then I noticed the same effect even after the cache hit rate was 100%, so I am guessing it has something to do with my system cache. I am running on an iMac. Do you see this effect on your system?

I think the improved tetra3_cedar pattern sequence is a major improvement over the original Tetra3.

I think cluster buster is a good idea but I have some concerns about the correlation of behavior between the solver and generator. I think the 1/16th approach is simpler and more reliable since both solver and generator can apply it consistently and independently of the stars in any particular FOV.

It looks like there is a bug in the pattern generation in tetra3_cedar, because it produces poor T_solve results, which I attribute to poor pattern coverage.

I saw this error message a few times but it didn't seem to be reliably repeatable:

```
image_fov 5
Iteration = 0 0.1858830451965332
Iteration = 1000 46.839502811431885
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/tetra3/tetra3.py:4617: RuntimeWarning: overflow encountered in divide
  fov2 = largest_edge / image_pattern_largest_edge * fov_estimate / 1000
```

This script generated the new patterns: exercise_tetra3_cedar_func_get_all_pats.py May take an hour or so to run.

This script exercises the new patterns: exercise_tetra3_cedar_newpats.py Takes a few minutes to run.

You can use this file to generate the database using the attached version of tetra3.py: better_pat_list

This file gives an example of my results: exercise_tetra3_cedar_newpats_results_cbuster_irc_db_16_only_excb_overhd_newseq You can see the min and mean T_solves are very close.

This is my hacked version: tetra3_cedar_newseq.py

This is the database I built: cedar_step_irc.npz

This report: tetra3_cedar_report

do_plot.py has a couple of possibly interesting plots. You can get pattern_centroid_table and pattern_list with:

```python
t2 = tetra3.Tetra3(load_database=None)
t2.generate_database(max_fov=30.0, min_fov=10, pattern_max_error=0.002,
                     star_catalog='hip_main', save_as=None)
```

```
tar -h -cvzf tetra3_cedar_results.tgz tetra3_cedar_report \
    exercise_tetra3_cedar_func_get_all_pats.py exercise_tetra3_cedar_newpats.py \
    better_pat_list \
    exercise_tetra3_cedar_newpats_results_cbuster_irc_db_16_only_excb_overhd_newseq \
    tetra3_cedar_newseq.py cedar_step_irc.npz do_plots.py
```

tetra3_cedar_results.tgz

smroid commented 2 months ago

Lots of stuff going on in your latest!

Regarding sky coverage in generate_database(): a couple of weeks ago I revamped cedar-solve's pattern generation approach to enforce uniform sky coverage. I think this addresses your observations about areas of poor coverage. Can you take a look at the latest cedar-solve version and see how its coverage stacks up to your version? (You'll also find that the latest version's generate_database() is noticeably faster.)

Regarding pattern coverage at databases built for smaller FOVs: it will help to switch to the tyc_main catalog, as it goes deeper than hip_main.

For the occasional error message about overflow in the fov2 calculation: I'll look into it, probably a simple issue.

I'm VERY interested in your alternative approach to cluster-buster. I'll take a look at your code and let you know if I have any questions.

Thanks for digging into this!

iain-clark commented 2 months ago

I see significant improvement with your latest database extraction, and I infer that the coverage issue is substantially addressed. The database I generated manually still provides generally better T_solve speeds, with somewhat lower min, max, and mean values and fewer results greater than 2x the mean. I think I still see bright patterns in my database that don't appear in yours. However, the results are much closer than before, and it is not clear to what extent the remaining differences are due to the 1/16th cluster-buster method or to the sequence I applied to my database.

I used the original solver with and without the 1/16th cluster buster, and your sequence, for the comparison, so there are 4 sets of results. It looks like your latest database struggles a little at FOVs 8 and 30.

See comparisons attached.

I saw this result:

```
image_fov 18 fail_count = 0 max T_solve = 73.0687080649659 min T_solve = 0.6736249197274446
```

but when I retested I got:

```
image_fov 18 fail_count = 0 max T_solve = 5.7333329459652305 min T_solve = 0.6735000060871243
```

So I suspect we have got T_solves down low enough that the way we measure them is too noisy to be really useful. Seems like the number of patterns tested should be part of the metric.

new_cedar_db_vs_irc_handrolled_cedar_seq_cb16_04_28_2024.tgz

iain-clark commented 2 months ago

BTW, today is pinhole photography day.

smroid commented 2 months ago

```
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/tetra3/tetra3.py:4617: RuntimeWarning: overflow encountered in divide
  fov2 = largest_edge / image_pattern_largest_edge * fov_estimate / 1000
```

I just pushed a fix for this (not verified).

iain-clark commented 2 months ago

I think I was able to improve apparent results by hacking your latest generate_database code.

First, I had added reporting for a count of patterns evaluated by the solver and found that gave a more consistent result than T_solve. I noted that my hand rolled database was giving more consistently better results in terms of the number of patterns evaluated than was apparent from T_solve values.

See files: new_cedar_db_vs_irc_handrolled_w_pat_count/new_cedar_db_vs_irc_handrolled_w_pat_count_deci

I thought it looked like the new database was missing some bright patterns, since the evaluated pattern counts were higher. I tweaked generate_database to use all the stars in the Hipparcos catalogue, increase the redundancy in the pattern FOV coverage, and decrease the pattern FOV step size. I stayed with the Hipparcos catalogue for consistency with the database I had previously built by hand. See the updated tetra3 version attached. I note that performance at FOV 30 is improved relative to other FOVs, whereas it had been noticeably less efficient previously.

See files: new_cedar_more_w_pat_count/new_cedar_more_w_pat_count_deci

The tweaked database is probably much too big but I was hoping it would point to some improvements.

tweaked_cedar_db_gen.tgz

smroid commented 2 months ago

Hi, acknowledging your latest message and efforts. It's going to be a while (several weeks) before I'll be able to look substantively into your findings and proposed changes. I'll ping here when I have done so.

Thanks!

iain-clark commented 1 month ago

It appeared that the performance of the three databases I previously mentioned was dependent on the cluster buster implementation. This is not surprising but was perhaps clouding the results. I replayed the analysis of the three different databases with and without the 1/16th cluster buster. My handrolled database was previously truncated to about 8e5 patterns but is now about 1.1e6.

I have included scripts to plot the T_solve and pattern count data and hopefully allow faster and more complete grasp of the data and maybe the implications. I compared these combinations:

I see "noise" patches develop in the T_solve results in some cases. It's not clear these are directly solver or database dependent but seem to be repeatable.

The artifact at RA/Dec -2/0.5 appears for both solvers and perhaps suggests a dependence on the cluster buster applied to the database.

Below are my notes where I compare results for each database using each solver.

results_loc_good_cedar_new_cb16solve

results_loc_good_cedar_new_origsolve

results_loc_good_cedar_new_more_cb16solve

results_loc_good_cedar_new_more_origsolve

results_loc_good_irc_new_full_cb16solve

results_loc_good_irc_new_full_origsolve

show_plots_results_loc_good_x6b.tgz show_plots_results_loc_good_x6a.tgz

iain-clark commented 1 month ago

Tripped over this again:

```
Iteration = 6000 3.4039859771728516
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/tetra3/tetra3.py:5328: RuntimeWarning: overflow encountered in divide
  fov2 = largest_edge / image_pattern_largest_edge * fov_estimate / 1000
Iteration = 7000 3.3858089447021484
```

Applied patch:

```python
# before:
largest_edge = self.pattern_largest_edge[hash_match_inds]

# after:
largest_edge = self.pattern_largest_edge[hash_match_inds].astype(np.float32)
fov2 = largest_edge / image_pattern_largest_edge * fov_estimate / 1000
```

Warning resolved.
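For what it's worth, here is a minimal reproduction of why the cast helps, assuming the stored edge values are half precision (my inference from the fix, not verified against the actual dtype): float16 tops out around 65504, so the intermediate quotient can overflow to inf before the later scaling brings it back into range. Promoting to float32 first keeps every intermediate finite.

```python
import numpy as np

# Hypothetical stored value in a narrow dtype (to keep the database small).
largest_edge = np.array([60000.0], dtype=np.float16)
image_pattern_largest_edge = np.float16(0.01)

with np.errstate(over='ignore'):
    bad = largest_edge / image_pattern_largest_edge   # float16 result: overflows to inf

# The patched form: promote before dividing.
good = largest_edge.astype(np.float32) / np.float32(image_pattern_largest_edge)
```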

smroid commented 1 month ago

Excellent, thanks for the confirmation!

iain-clark commented 1 month ago

I extracted and plotted pattern density histograms for each of the three databases previously mentioned. The salient difference is that my hand-rolled database irc_new_full has a higher pattern-count floor at fov = 10, while the new_cedar_hip db has counts down near zero and new_cedar_hip_more is much closer to irc_new_full.

It looks like tuning the lower limit of the FOV range and the field overlap at each FOV is all that's required for your latest generate_database to be solid. I am not sure yet how those two trade off, but it seems a lower fov_min value ensures more spatial diversity. I suspect that limiting star magnitude might be limiting coverage, at least for Hipparcos main; that is one difference between irc_new_full and new_cedar_hip_more.

I was a little surprised to see increased coverage at the poles for irc_new_full, as I expected np.unique to remove a lot of redundant patterns; apparently I was actually getting additional unique patterns into the db with the small RA step sizes near the poles. I had already had a go at changing the step sizes near the poles, but the Fibonacci lattice seems to be the way to go and I'll see about using that instead.
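For reference, the standard golden-spiral construction of a Fibonacci lattice looks like this (a generic sketch, not code from either database generator). Unlike stepping RA at fixed Dec rows, it places points with near-uniform density over the whole sphere, with no pole clustering:

```python
import numpy as np

def fibonacci_lattice(n):
    """Return an (n, 2) array of (RA, Dec) in degrees, near-uniform on the sphere."""
    i = np.arange(n)
    golden = (1.0 + np.sqrt(5.0)) / 2.0
    # RA winds around the sphere by the golden angle; Dec is chosen so that
    # sin(Dec) is uniformly spaced, giving equal-area spacing in latitude.
    ra = np.degrees(2.0 * np.pi * i / golden) % 360.0
    dec = np.degrees(np.arcsin(1.0 - (2.0 * i + 1.0) / n))
    return np.column_stack([ra, dec])
```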

BTW, the plots are using the Hammer projection if it matters.

pat_hist.tgz

smroid commented 1 month ago

I'm back, will start looking at your work in more detail.

May I incorporate (with changes) your test program into the cedar-solve repo? If so, please let me know:

iain-clark commented 1 month ago

Welcome back.

You are free to use my work as you see fit. Copyright/licensing: I am inclined toward free-to-use for everyone, but will go with whatever you're doing. Credit: sure, if you use enough to matter, please mention my name and maybe the scope of the contribution.

What led me down this rabbit hole was the inability to assess the quality of my results so I think that a test suite makes sense as part of the release package. Let me know if there is any particular work you'd like me to do there. I think the obvious first step would be to apply the Fibonacci lattice to the existing tests as I said. Other thoughts I'd had were to include sensitivity to errors in the centroid locations and brightness order.

There seems to be a consistent -1% error in the calculated FOV compared to what I intended. It may be a result of how I produce the centroid lists I use for testing, so keep that in mind.

Early on I had done testing with PNG images instead of centroid lists, and it seemed there was distortion in my images when I used return_visual. I may go back and revisit that; I would add return_visual to solve_from_centroids as a first step. But is my method for producing centroid lists introducing distortion? I think I am using the pinhole transform correctly and that it should introduce no distortion. Please let me know if you see any obvious errors.

Thanks

smroid commented 1 month ago

I'm creating a FOV generator/tester along the lines of your ideas.

Related to your recent findings, I've updated cedar-solve to use much larger values for lattice_field_oversampling (was 20 by default, is now 100 by default and I use 1000 when generating my DBs). This yields fuller spatial coverage of patterns.

Relevant comment in new version:

```python
# Making lattice_field_oversampling larger yields more patterns, with diminishing
# returns.
# value   fraction of patterns found compared to lattice_field_oversampling=100000
# 100     0.61
# 1000    0.86
# 10000   0.96
```

Regarding cluster-buster approaches, can you summarize your idea(s) and what you found when investigating them? I currently believe the implemented cluster-buster approach has no major drawbacks, but if that's not correct I'm open to other ideas.

Thanks! -Steven

smroid commented 4 weeks ago

I've added test_sky_fovs.py, which uses elements of the tester you developed. Thanks for getting this going, it is very useful!

BTW, I'm not seeing a 1% FOV error; it's more like 0.05%.

iain-clark commented 3 weeks ago

I don't think there's any real difference in the effect of my cluster-buster approach. I picked 1/16th, which is slightly different from what you used, but not necessarily better or more correct. I applied the 1/16th test in the solver on each pattern as it was calculated, so maybe that's more "pythonic". If the image solves early, this approach saves about 15% in T_solve, since the overhead of the KDTree is not incurred. I saw a minor difference in the max number of patterns required to solve an image with CB16, but I can't really attribute that to how CB was implemented; it's more likely due to the difference in the exact CB values used.

smroid commented 2 weeks ago

I'm going to keep the cluster-buster as is, since it seems OK and it is currently in use by PiFinder with good results.

I'll close out this bug. Thanks for the great investigations of this issue!