[RFC] API for "zipped" parallel computation

I'd like implement an additional API for "zipped" parallel computation with the SatrecArray. The current computation uses the "outer product" of the Satrecs and the julian dates, generating a total of len(satrec_array) * len(jds) state vectors. This is not efficient if when want each Satrec propagated to a different time, e.g., we want to compute

states = [satrec.sgp4(jd, fr) for satrec, jd, fr in zip(satrec_array.satrecs, jds, frs)]

Our options for this use case are:

Use a for loop in python as above. This means we loose the C++ accelerated parallelism.
Use the C++ vectorized computation---this requires N^2 computations when we only need N.

The changes required to implement a new method are fairly minimal and I'd be happy to make a PR. Basically, I propose that we have a new API on SatrecArray. In the python version, it would look like this:

class SatrecArray:
    # ...
    def sgp4_zip(self, jds, frs):
        """Compute positions and velocities for the satellites in this array each at a different time.

        Given NumPy scalars or arrays ``jd`` and ``fr`` supplying the
        whole part and the fractional part of one or more Julian dates,
        return a tuple ``(e, r, v)`` of three vectors that are each as
        long as ``jd`` and ``fr``:

        * ``e``: nonzero for any dates that produced errors, 0 otherwise.
        * ``r``: (x,y,z) position vector in kilometers.
        * ``v``: (dx,dy,dz) velocity vector in kilometers per second.

        This method differs from ``SatrecArray.sgp4`` in that the input ``jd`` and ``fr``
        must have the same number of elements as ``Satrec``s in the array, and each 
        output has ``len(jd)`` elements instead of ``len(jd) * len(satrecs)``.
        """

        results = []
        z = list(zip(self._satrecs, jd, fr))
        for satrec in z:
            results.append(satrec.sgp4(jd_i, fr_i))
        elist, rlist, vlist = zip(*results)

        e = self.array(elist)
        r = self.array(rlist)
        v = self.array(vlist)

        jdlen = len(jd)
        mylen = len(self._satrecs)
        e.shape = (mylen)
        r.shape = v.shape = (mylen, 3)

        return e, r, v

As you can see, the code above is almost identical to the existing code in model.py. I believe it's similarly easy to add a method to wrapper.cpp to support this API.

Feedback requests

In light of this, I'd like feedback on the following:

Would you welcome a pull request for this? And, assuming that the answer is affirmative,
How do you feel about the name sgp4_zip? I'm not super fond of it, but I'm having trouble thinking of a better one. Perhaps sgp4_vectorized, but that's a little overloaded since both methods on SatrecArray are vectorized.
I know that this would require an update to sgp4/__init__.py for documentation, sgp4/model.py for python implementation, sgp4/tests.py for tests, sgp4/wrapper.py for delegation to the C-implementation, and finally extension/wrapper.cpp for the C-implementation itself. Is there anything else to be done?
Anything I forgot?

Thanks in advance.

I have just a few minutes today to look at this issue, so I've not yet reached a conclusion, but wanted to at least add a note that this is an interesting idea, and I will ponder whether it might be added in the future. I'm a bit dismayed already by the amount of code that the array feature forced into the codebase, so my first reaction is to not want to add more. I wonder if this could be added as an additional case handled by the existing loop, or whether it really requires an additional copy of the loop logic. Now I kind-of wish that I had looked into NumPy array broadcasting as the native approach here, since then users could select the behavior they want by either having their date array have shape (1, N,), in which case a convolution with a satellite array of dimension M would produce the same M×N product that's returned today, but a shape (M,) would have broadcast naturally by assigning one date to each satellite.

Maybe the library could switch to that approach under the covers, then also keep offering the current API as a wrapper around it.

brandon-rhodes / python-sgp4

[RFC] API for "zipped" parallel computation #114

Feedback requests