skyfielders / python-skyfield

Elegant astronomy for Python
MIT License
1.44k stars 214 forks source link

Improvements to `keplerlib.py` and `elementslib.py` #481

Open JoshPaterson opened 4 years ago

JoshPaterson commented 4 years ago

I'd like to start a discussion of what things in keplerlib.py and elementslib.py can be improved, and also putting them in priority order so I can work on the more important things first. Here are some ideas I've had:

brandon-rhodes commented 4 years ago

I'd like to start a discussion of what things in keplerlib.py and elementslib.py can be improved…

Thanks for making some suggestions! This was a busy semester and, obviously, I didn't get around to sitting down and figuring out a list of priorities. But your list does a good job of getting ideas started, so:

  • Should all of the math functions in elementslib have one code path that can handle floats or arrays instead of one path for arrays and one path for floats? Possibly a decorator could convert floats into arrays and then back to floats so only the array branch would be necessary

Typically I have been able to write a single code path to support both, thanks to how NumPy lets math operators be agnostic about whether they work on scalars or strings. There are skyfield.functions to make things like matrix work possible without checking whether, say, time is a vector or not.

If you would like to share a sample math operation you can't see a way to make array-neutral, I can take a look and see whether it's a pattern I've been able to surmount before (or not!).

  • Should the KeplerOrbit and OsculatingElements objects use au^3/d^2 to store GM internally to match the rest of Skyfield instead of km^3/s^2? I think the only place this would effect users is that OsculatingElements has a keyword argument mu_km_s. How should this be handled?

Check out my most recent commits to the module for official guidance, but the way I remember veering last time was to accept only km³/s² as an argument because it seemed to be the units used universally by everyone, but then to go ahead and convert it and save it secretly in au+day units to save having to do conversions every time a position is asked for. See if that matches what I actually did most recently. :slightly_smiling_face:

  • Should the user be expected to specify a value for GM whenever it is necessary rather than saving values internally? If so how should this be handled for backwards compatibility?

Good question! Yes: let’s take mpcorb_orbit() as the current gold standard for how the interface should work. The two competing alternatives would be to always use the current value of GM even in future versions of Skyfield; or to update it every half-decade or so when an even better number comes out, and suddenly Skyfield programs would give different answers; and I don't like either of those alternatives as much as I like having everyone's programs openly declare through a little import statement which GM they are using for that program.

  • I need to refamiliarize myself with what is happening in PR #340 so it can be either merged or closed

Feel free to close it if a blank slate would be easier to start with!

  • After the signature for KeplerOrbit objects is finalized it can be specified in the API reference section.

I would like to move to parameter names that are full words, with units attached, as you can see in the _from_periapsis() method. It also felt like there were a lot of little methods that were nearly identical, and only differed in which way they were given the size of the orbit. It felt like they could usefully collapse and be simplified somehow.

  • revise the documentation for osculating orbital elements so it better matches the rest of the documentation both in tone and in the formatting of the examples

It seems strange that the object that represents orbital elements isn't involved in the construction of MPC orbits. It feels like instead of a huge block of a dozen parameters being the input to _from_periapsis(), it could just be an orbital element array object instead? It felt like having a single standard lean way to represent orbital elements might also provide the platform where we could accept sizes different ways, without the Kepler class needing so many constructor methods.

But I haven't had the time to really dig in to the differences between the Kepler scheme and the Elements library scheme, so since all I really needed were comet and asteroid positions, I hooked things up as simply as I could, even though it means I'm still speaking of orbits using huge sprawling parameter lists.

So, there are some ideas! Are they any help to getting started? Feel free to ask more!

Bernmeister commented 4 years ago

I'd like to start a discussion of what things in keplerlib.py and elementslib.py can be improved

Is this anything to do with comets and minor planets? If so, I'm putting together a test script to load/compute a comet (and minor planet) position (and then calculate apparent magnitude using the H,G model) and compare that with PyEphem. I've found Skyfield to be really slow when computing comets (and minor planets) and was wondering if some speed up was feasible. If this is unrelated, apologies, I'll finish off my test and then open a fresh ticket.

brandon-rhodes commented 4 years ago

@Bernmeister — If you have a short script that runs slowly for you, feel free to share if you would like us to take a look. Generally Skyfield puts usability and correctness first, but performance is a very serious third place behind those two, and if something seems unusually slow I can take a look.

Bernmeister commented 4 years ago

@brandon-rhodes Am crafting said script (it is similar to that in #449 but taking into account your suggestions with regards to dataframe loading). I'll post the script in a new ticket so this doesn't get muddied.

JoshPaterson commented 4 years ago

Thanks for the feedback! That definitely gives me plenty to work on.

If you would like to share a sample math operation you can't see a way to make array-neutral, I can take a look and see whether it's a pattern I've been able to surmount before (or not!).

One good representative example is here: https://github.com/skyfielders/python-skyfield/blob/23c1691c79e29b981f66bc0d69c5f25ae8833386/skyfield/elementslib.py#L256

The scalar branch uses an if statement so that different equations are used for elliptical, parabolic, and hyperbolic orbits. The vector branch uses boolean indexing to accomplish the same thing (which causes errors for scalars).

Check out my most recent commits to the module for official guidance, but the way I remember veering last time was to accept only km³/s² as an argument because it seemed to be the units used universally by everyone

I like the argument name you chose (gm_km3_s2), I will change other places to match that!

It seems strange that the object that represents orbital elements isn't involved in the construction of MPC orbits. It feels like instead of a huge block of a dozen parameters being the input to _from_periapsis(), it could just be an orbital element array object instead? It felt like having a single standard lean way to represent orbital elements might also provide the platform where we could accept sizes different ways, without the Kepler class needing so many constructor methods.

That's a good point that OsculatingElements objects and KeplerOrbit objects are both fundamentally representing the same thing. Ideally there would be a way to combine them so when a user wants osculating elements they get back a KeplerOrbit object that has attributes for all of the elements, but also behaves like a VectorFunction object too. One current obstacle to this is that the OsculatingElements object represents a vector of orbits, while a KeplerOrbit object represents a single orbit. I do think it would be possible to vectorize the function keplerlib.propagate so that it could propagate positions for multiple objects to multiple times, but this probably shouldn't happen without the rest of Skyfield being vectorized in a similar way. I'll have to think about if there's at least a way to have them work together better in the near term.

The unwieldy initialization methods for KeplerOrbits are a problem. I'll think about if there is some way to have one unwieldy initialization method instead of multiple overlapping methods. I'm not sure initializing KeplerOrbits with OsculatingElements objects would help much because OsculatingElements objects are currently only initialized with state vectors, so that would just entail moving the unwieldy initialization methods rather than deleting them; the conversion from elements to vectors needs to exist somewhere.

brandon-rhodes commented 4 years ago

One good representative example is here:

https://github.com/skyfielders/python-skyfield/blob/23c1691c79e29b981f66bc0d69c5f25ae8833386/skyfield/elementslib.py#L256

The scalar branch uses an if statement so that different equations are used for elliptical, parabolic, and hyperbolic orbits. The vector branch uses boolean indexing to accomplish the same thing (which causes errors for scalars).

Ah. Yes, if statements are the tricky ones. Drat. So, if you search the codebase for getattr(..., 'shape', ()) and things like getattr(..., 'ndim', 0), you will see the similar places where I have had to take different approaches for the two cases.

Given that example, I think it might indeed make sense to always keep the elements as arrays inside the routines, but store an is_scalar Boolean in the instance that reminds you to take off the array when returning positions and velocities for what wasn’t originally an array.

That's a good point that OsculatingElements objects and KeplerOrbit objects are both fundamentally representing the same thing… I'm not sure initializing KeplerOrbits with OsculatingElements objects would help much because OsculatingElements objects are currently only initialized with state vectors…

If we could fix how OsculatingElements are initialized, would a system like the following have any contradiction?

So if you start with elements, then you would build an OsculatingElements object — and would be free to set the elements gradually, and to set the ones you knew, leaving blank the ones you didn’t know but that could be computed from them — then would pass that object to KeplerOrbit.from_elements() which would convert them to vectors?

Ideally there would be a way to combine them so when a user wants osculating elements they get back a KeplerOrbit object that has attributes for all of the elements, but also behaves like a VectorFunction object too.

If we keep the distinction that OsculatingElements knows the angles while KeplerOrbit knows the vectors, then I won’t mind if OsculatingElements can’t be a vector function; folks can always convert to a KeplerOrbit.

One current obstacle to this is that the OsculatingElements object represents a vector of orbits, while a KeplerOrbit object represents a single orbit. I do think it would be possible to vectorize the function keplerlib.propagate so that it could propagate positions for multiple objects to multiple times, but this probably shouldn't happen without the rest of Skyfield being vectorized in a similar way.

What’s an example of an operation in Skyfield which would need to be further vectorized to support the operations you’re thinking of?

JoshPaterson commented 4 years ago

If we could fix how OsculatingElements are initialized, would a system like the following have any contradiction?

OsculatingElements is initialized with and remembers the angular orbital elements.
KeplerOrbit is initialized with and remembers the vectors.
Conversions take place through methods with names like:
    OsculatingElements.from_orbit()
    KeplerOrbit.from_elements()

So if you start with elements, then you would build an OsculatingElements object — and would be free to set the elements gradually, and to set the ones you knew, leaving blank the ones you didn’t know but that could be computed from them — then would pass that object to KeplerOrbit.from_elements() which would convert them to vectors?

I like that a lot! I think that would be more intuitive than the current arrangement.

What’s an example of an operation in Skyfield which would need to be further vectorized to support the operations you’re thinking of?

I think the way I explained this was pretty confusing, sorry about that. I've seen you talk in a few places of Skyfield adding the ability to return arrays where one dimension was different times and the other dimension was different objects. When I said "this probably shouldn't happen without the rest of Skyfield being vectorized in a similar way" I was presuming that that feature should be added to those other places in Skyfield to work any kinks out before it gets added to KeplerOrbit objects, but perhaps I was wrong to assume that?

brandon-rhodes commented 4 years ago

I like that a lot! I think that would be more intuitive than the current arrangement.

Is it clear enough what this would look like that you could try a quick draft PR, or should I suggest a rough draft of what the API would look like, to save you a round-trip if you are imagining something a bit different?

…I was presuming that that feature should be added to those other places in Skyfield…

Happily, Skyfield ignores any dimensions above the first dimension, which is usually meaningful like "x y z" or "ra dec distance", and the second dimension, which is usually time. But any further dimensions should work seamlessly, as long as both time and coordinates are extended to the same number of dimensions. I should probably document this sometime soon. It will be a couple of weeks before I have time to do so, alas. But: my belief is that Skyfield is ready and users even today can set up whatever additional dimensions they want, and have them work seamlessly with the existing code (unless there are edge cases sitting around that need to be solved?).

JoshPaterson commented 4 years ago

Is it clear enough what this would look like that you could try a quick draft PR, or should I suggest a rough draft of what the API would look like, to save you a round-trip if you are imagining something a bit different?

Yes, this feels clear enough to make a draft PR.

any further dimensions should work seamlessly, as long as both time and coordinates are extended to the same number of dimensions.

Interesting! I think KeplerOrbit objects are similar in that once keplerlib.propagate is vectorized the KeplerOrbit doesn't need to change, the arrays are just passed right through.