LSSTDESC / skyCatalogs

Create sky catalogs and provide access via API
BSD 3-Clause "New" or "Revised" License
6 stars 4 forks source link

Sort obj_types to make sure ordering is deterministic #62

Closed rmjarvis closed 9 months ago

rmjarvis commented 9 months ago

When writing a test in imSim, I ran into a problem that the order of objects coming out of skyCatalog was not stable. Sometimes stars were first. Sometimes galaxies.

I tracked it down to the function toplevel_only, which uses Python set to build the set of unique object types. All sounds quite reasonable, except that even in Python 3.11, set doesn't guarantee the order of iteration to be anything in particular. Even for exactly equivalent inputs, the result flipped around seemingly at random.

This surprised me, since I knew they made dict ordered starting in Python 3.7. So I didn't expect set to not be at least stable. But apparently so.

Anyway, the PR changes that to sort the final objs_copy, so that the order is always deterministic.

JoanneBogart commented 9 months ago

So I am confused. I believe toplevel_only is called only somewhat further down in the same file, and the result is immediately sorted. I think doing it within the routine as you did is better for maintainability, but I don't understand why it was necessary to get a deterministic result.
If we make this change, I suggest we get rid of the unnecessary sort (around line 412) following the call to toplevel_only

rmjarvis commented 9 months ago

Oh. I didn't notice the sort() call. I was using v1.6.0rc2, which doesn't seem to have it yet. But yes, using the current main also works for me. The way you do it is also fine I think.

rmjarvis commented 9 months ago

Closing this as unnecessary.