skyfielders / python-skyfield

Elegant astronomy for Python
MIT License
1.42k stars 213 forks source link

Add a function to Geocentric to produce velocity in earth-fixed ITRF #476

Closed element15 closed 3 years ago

element15 commented 3 years ago

When working with satellites, it is often necessary to have the velocity of a given satellite expressed in ECEF coordinates. Such a function would essentially mirror Geocentric.itrf_xyz(), but would produce a Velocity object. The process for transforming the velocity in this manner are identical to those for position, so the existing rotation matrix Time.M and Time.gast are sufficient.

I can submit a PR implementing this, but I'd like feedback on if there is a better place for this functionality to live (and if this feature is welcome in the first place). My first thought is to add an additional function called Geocentric.itrf_xyz_velocity(), but I'm brand new to this tool, so there may be a better place.

brandon-rhodes commented 3 years ago

That sounds like a useful addition! It might take me a couple of weeks for me to answer in detail, as I'm currently working on Skyfield's handling of polar motion corrections, but I'll go ahead and mark this as a feature request and say that, yes, I think we should work out how this should fit into Skyfield.

Meanwhile: have you worked out how to produce this value yourself, so your project is unblocked? Or is one of the rotations not quite producing the output you need? If there's any snag, let me know, so at least you have the numbers you need quickly whether or not they come from inside of Skyfield!

element15 commented 3 years ago

Great, and yes. I'm basically implementing the function I descried in my own module. It isn't pretty, but it's a heck of a lot better than manually interfacing with SGP4 myself like I was doing before.


import numpy as np
from skyfield.functions import (mxv, rot_z)
from skyfield.constants import tau
from skyfield.units import Velocity
def itrf_xyz_vel(geo):
    """Adapted from skyfield.positionlib.Geocentric.itrf_xyz()"""
    t = geo.t
    au_per_d = mxv(t.M, geo.velocity.au_per_d)

    spin = rot_z(-t.gast * tau/24.0)
    au_per_d = mxv(spin, np.array(au_per_d))

    return Velocity(au_per_d)
'''
brandon-rhodes commented 3 years ago

It always makes me pause for a moment that we never account for the rate at which t.M rotates — which really ought to make it into the velocity too — but I suppose it's such a small effect that it might even be below machine precision and something the vectors can't represent.

I'm glad you have a local implementation, and I'll return to this when I get the chance — at the latest, by mid-December, when I return from a trip.

brandon-rhodes commented 3 years ago

There! Per the commit shown above, the next version of Skyfield should allow positions to be turned into ITRS positions and velocities. Instead of adding yet another method — which eventually would have required dozens of methods, if we continued down the dark path of two methods (position and velocity) per reference frame (of which we have at the moment at least a half-dozen) — I have pivoted to a general mechanism that should let us support as many frames as we want.

This will be released with the next version of Skyfield, probably in about two weeks, but you can try it out ahead of time with:

pip install https://github.com/skyfielders/python-skyfield/archive/master.zip

Enjoy!

element15 commented 3 years ago

Ah, that's a much cleaner way to do it, and it seems to work splendidly. Thank you for implementing this so quickly!

I do find the choice of name _twist_at a bit puzzling, though. Is twist distinct from rate? I understand that Skyfield is primarily an astronomy library, so perhaps that nomenclature comes from astronomy? (My background is mostly in geocentric navigation systems.)

brandon-rhodes commented 3 years ago

Thanks for trying it out ahead of time! I’m glad it seems to produce useful numbers.

“Twist” is, yes, distinct from rate, and is a private term of my own since I can’t find an official term for the matrix that method produces. A toolkit like SPICE offers routines that handle a 6×6 matrix XFORM that looks like (to quote randomly from their zzcorsxf.c file):

/*                        -               - */
/*                       |         :       | */
/*                       |  R(t)   :   0   | */
/*                       |........ :.......| */
/*                       |         :       | */
/*                       | d(R)/dt :  R(t) | */
/*                       |         :       | */
/*                        -               - */

You can then multiply a 6-element vector x y z dx/dt dy/dt dz/dt by this matrix to transform the vector from, say, GCRS into the ITRS.

But that approach seems to require a little bit of extra work. If I have, say, an Earth location, and an x y z vector for it in the ITRS, then I can fairly readily produce an ITRS vector for the direction that the point is moving in a non-rotating reference frame — a vector that looks something like -y x 0 multiplied by the angular velocity of the Earth. That’s the matrix I’m inexactly calling the “twist” — “the velocity of the ITRS position in that reference frame, if the ITRS reference frame were suddenly to stop spinning.”

Which is different from the d(R)/dt matrix in SPICE, which instead is the “twist” rotated into the GCRS to create a pure-GCRS matrix about the velocity. Doing this matrix × matrix operation makes the SPICE 6×6 matrix very easy to apply: you just do a single matrix multiply, and get the position and velocity out.

But you can also avoid the matrix × matrix multiply to bring the “twist” all the way into the GCRS, by doing two matrix multiplies in a row instead of trying to cleverly fit it all into a single 6×6, as you can see from the code in the commit referenced above: if you apply the twist before then doing the final rotation into the target reference frame, then both the ITRS-relative velocity and the velocity due to the reference frame itself are naturally rotated together as a single quantity.

So for this first operation version of reference frame conversion, I decided to try to be clever and avoid the expense and rounding of an extra matrix × matrix multiply. We’ll see how it goes; there might be reference frames where that ‘twist’ matrix is not as easy to build as it is for the ITRS? And let me know if you are aware of any official name for it; one argument against its use in the code might be simply how difficult it is to explain!

element15 commented 3 years ago

So looking through framelib with that description in mind, the the twist looks to me to be the skew-symmetric angular velocity. That's a bit of a mouthful for a member name, though. The matrix _itrs_angvel_matrix would be what I would describe as -\Omega_{ie}^i where \Omega describes some skew-symmetric matrix, and the i and e frames are GCRS and ITRS, respectively.

The biggest challenge I see with trying to use the derivative of the rotation matrix is estimating that derivative. Unless I'm missing something, that would have to be a finite estimate of the derivative, which may introduce more error than the existing solution. Keep in mind that the approximation d(R)/dt_i^e = -R_i^e \Omega_{ie}^i becomes exact in the limit as dt -> 0, so unless you're trying to capture some irregularities in the earth's rotation rate, it may not be worth the trouble.

All that said, if this idea gets extended to arbitrary rotating reference frames which may not have an approximately constant rotation rate, the derivative method may be more extensible.

Thanks for the thorough answer on this.

brandon-rhodes commented 3 years ago

The biggest challenge I see with trying to use the derivative of the rotation matrix is estimating that derivative. Unless I'm missing something, that would have to be a finite estimate of the derivative, which may introduce more error than the existing solution.

The SPICE library seems to calculate the derivative of the rotation matrix by getting the skew-symmetric angular velocity matrix, and multiplying it by the rotation matrix R that rotates GCRS → ITRS. I think that result is what it plugs into the lower-left corner of its 6×6 state-transformation matrix. Would that produce the matrix you are thinking of, without incurring the need to perform a finite estimate?

element15 commented 3 years ago

That's exactly what I'm thinking. I've included the relevant equation from "Principles of GNSS, Inertial and Multisensor Integrated Navigation Systems" by Paul D. Groves.

image

In this case C is a rotation matrix, \Omega is a skew-symmetric angular velocity, and \alpha and \beta are two arbitrary frames. He cites another textbook for the full derivation, "Applied Mathematics in Integrated Navigation Systems" by R. M. Rogers, but I don't have that one, so the arXiv paper I linked to before shows a similar derivation of these identities.

brandon-rhodes commented 3 years ago

Thanks for providing that additional background! Instead of _twist_at(t), would something like _angular_velocity_at(t) or even more explicitly _angular_velocity_matrix_at(t) better suggest the properties of the return value? I could add a docstring explaining that it’s skew symmetric, if I didn’t want to add two further words to the name. :slightly_smiling_face:

element15 commented 3 years ago

I think calling it _angular_velocity_at(t) and including a note about being skew-symmetric in a docstring may be the right move.

To that end, I believe the current implementation may have a sign error. All the skew-symmetric matrices we're dealing with here are a special type which encodes a vector cross product. For some vector [a, b, c], the corresponding skew-symmetric form is given by:

/  0   -c    b \
|  c    0   -a |
\ -b    a    0 /

So in the specific case of the earth angular velocity in ITRS (or in GCRS for that matter), the angular velocity is in the positive z direction, so _itrs_angvel_matrix should be:

array((
    (0.0, -DAY_S * ANGVEL, 0.0),
    (DAY_S * ANGVEL, 0.0, 0.0),
    (0.0, 0.0, 0.0),
))

This means that in frame_xyz_and_velocity(), you'll need to subtract mxv(V, r) rather than adding it to v. This is consistent with the identity posted before which includes a minus sign.

brandon-rhodes commented 3 years ago

Thanks for the detailed suggestions! You have sent me back into the SPICE library for my morning reading, to work out what they call this matrix, and to learn about its proper sign — since my goal here is for Skyfield to follow their approach fairly closely. Here is their Required Reading document on rotations:

https://naif.jpl.nasa.gov/pub/naif/toolkit_docs/FORTRAN/req/rotation.html

And here are the docs to two specific routines whose source code I have just reviewed:

The second source file defines a crucial subroutine called in the first source file — which happens to be EUL2XF(), rather than its inverse for which the file is named.

  1. Upon review, it looks like the SPICE library does add, not subtract, the two products that Skyfield calls mxv(R, v) and mxv(V, r) when SPICE wants to arrive at the output rotated velocity vector. This results from their 6×6 transformation matrix being multiplied, without having any signs changed, with a 6-vector x y z dx/dt dy/dt dz/dt.
  2. The lower-left 3×3 corner of their 6×6 matrix is occupied by a matrix that they call dR/dt, and that in the Skyfield code would be called mxm(V, R) — that is, it’s occupied by my “spin” matrix V rotated into the target reference frame. Do they have a name for what I call V? Yes, they do: they call it (drum roll—) DRDTRT! Which evidently is their shorthand for (dR/dt) × Rᵀ. Which looks more like a description than a real name? You can see down at the bottom of xf2eul.f where they run CALL MXM ( DRDTRT, R, DRDT ) to produce the 3×3 matrix that winds up in the lower-left of their 6×6: multiplication by R "takes away" the RT-ness of the matrix.
  3. Therefore, I should probably switch from _twist_at() to something like _dRdt_RT_at() so that the name I’m using at least resembles some other existing working implementation of this transform, instead of being entirely idiosyncratic.
  4. Working through their derivation, they arrive at an expression for (dR/dt) × Rᵀ that looks like this:
C            T
C     dR/dt*R  = dALPHA/dt OMEGA
C                                A
C
C              + dBETA/dt  [ ALPHA ] OMEGA  [ -ALPHA ]
C                                   A     B           A
C
C              + dGAMMA/dt [ ALPHA ] [ BETA ] OMEGA [ -BETA ]  [-ALPHA]
C                                   A        B     C         B         A

To imagine our simple earth-angular-velocity case, I believe we can take dBETA/dt and dGAMMA/dt to be zero, wiping out both the second and third terms, and reducing our result to dALPHA/dt OMEGA 3 where dALPHA/dt is DAY_S * ANGVEL and where they define OMEGA as:

C         [   0      D_3n    -D_2n  ]
C         |  -D_3n    0       D_1n  |
C         [   D_2n  -D_1n      0    ]

Thus, in the case OMEGA 3 (3 = the z-axis in their notation), this is a matrix whose top row has a positive term in the middle and whose second row starts with a negative term — exactly as in my _itrs_angvel_matrix. So I think my code is at least following their own approach with respect to sign, whether that approach is idiosyncratic mathematically or not?

Here’s their source file, so you don’t have to download their whole Fortran distribution yourself if you want to follow their derivation and see where a sign difference might have crept in between their approach, which uses rotation matrices, and your own approach that instead considers a matrix cross product directly — they do note that:

C     The product dR/dt*R  is a skew symmetric matrix and hence can
C     be represented as a cross product,

— so that might provide a starting point for comparing your cross product with theirs. The file:

xf2eul.f.txt

Let me know if you work out why there’s a difference in sign!

element15 commented 3 years ago

I think the discrepancy arises from how OMEGA_n is defined. From lines 501-504 of xf2eul.f:

C     (D_ij   denotes the Kronecker delta.)  Note that OMEGA * v
C                                                           n
C     yields -e  x  v  for all vectors v.
C              n

If I'm reading that right, they're defining their skew-symmetric matrix in this case to correspond to a negative cross product (or, equivalently, a cross product taken from the right). Referring to the resulting matrix as dR/dt*R^T or DRDTRT I find pretty helpful in this regard. If we apply this definition to the set of identities I posted yesterday from the Groves text, we see that the rightmost version of the substitution is being used, or, more exactly, dR/dt = Omega_ei^e * R.

Note that the angular velocity here is Omega_ei^e rather than Omega_ie^i, which gets to the heart of the difference1. The skew-symmetric angular velocity we're dealing with here (DRDTRT, _twist_at(), V, etc.) is describing the angular velocity of the inertial frame relative to and coordinatized in the destination frame. More specifically for us, this means that what _itrs_angvel_matrix is describing is the angular velocity of GCRS measured in the ITRS frame, which is indeed oriented in the negative z direction.

A bit further down they expand a bit on the skew-symmetry of DRDTRT (lines 518-534):

C                        T
C     The product dR/dt*R  is a skew symmetric matrix and hence can
C     be represented as a cross product,
C               T
C        dR/dt*R  V  = W x V
C
C     for all vectors V, provided that
C
C                       T
C        W(1) =  dR/dt*R  (3,2)
C
C                       T
C        W(2) =  dR/dt*R  (1,3)
C
C                       T
C        W(3) =  dR/dt*R  (2,1)
C

Assuming those indices are row-major (I vaguely recall matrices sometimes being column-major in FORTRAN [?]), this agrees with my earlier sign convention for denoting skew-symmetry as a cross product. I think the sign convention subtleties get a little lost in the xf2eul derivation because of all the scaffolding to deal with Euler angles.

Ultimately I think you're right to want to stay in agreement with an established implementation like SPICE, but an explanation in the docs about exactly which reference/resolving frame is being used for what is currently called _twist_at() could be beneficial, especially as this idea may be extended to arbitrary rotating frames.

Thanks again for unpacking all this with me. Chasing sign conventions I'm sure isn't the most thrilling use of your efforts here.

Edit: Just saw that commit from this morning after posting this.


1 I'm not sure I ever fully explained the notation here. An angular velocity Omega_ab^c describes the angular velocity of frame b with respect to frame a using the coordinates of frame c. In many cases, a and c will refer to the same frame.

brandon-rhodes commented 3 years ago

… an explanation in the docs about exactly which reference/resolving frame is being used for what is currently called _twist_at() could be beneficial, especially as this idea may be extended to arbitrary rotating frames.

For the moment I will see if the renaming helps me keep it straight; happily, I still have a _ in front of the same, so I can keep improving the name if the next time I return to it I’m again confused but come up with an improvement. I’ll think about a docstring, and probably wait until I have a bit more experience with reference frames and thus more examples to draw on; sometime soon I'll probably also transition the CIRS support to being a new frame object.

The skew-symmetric angular velocity we're dealing with here (DRDTRT, _twist_at(), V, etc.) is describing the angular velocity of the inertial frame relative to and coordinatized in the destination frame.

Yes, I think that puts it well. To compare with my way of thinking about it — my governing principle in preferring the current sign is, I suspect, this code from frame_xyz_and_velocity():

        r = mxv(R, r)
        v = mxv(R, v)
        at = getattr(frame, '_dRdt_times_RT_at', None)
        if at is not None:
            V = at(self.t)
            v += mxv(V, r)

Since R gets applied in a "forward" direction (without a sign change and without transposition), I like that V gets applied the same way (without sign change, without transposition). But what, then, does V physically mean? Matrix V must mean "how a vector r rotated into the target reference frame, that's stationary in the inertial frame, will move in that new frame." So if the vector r is, say, a point on the equator out in the direction of the vernal equinox and thus roughly [1 0 0] in Earth radii, then from the point of view of the ITRS it’s a vector spinning backwards in the opposite direction to the Earth’s rotation.

Moving from the GCRS into the ITRS, then, means that everything in the universe starts rotating in the opposite direction from the Earth’s rotation: New York is heading east around the axis, so the Sun does the opposite and rises in the east and sets in the west.

Thus, exactly as you say: its sign is reversed with respect to the direction of rotation of the frame itself.

Thanks again for unpacking all this with me. Chasing sign conventions I'm sure isn't the most thrilling use of your efforts here.

It’s surprisingly helpful to discuss points of Skyfield’s design with other folks familiar with the math (often, more familiar with the math than I am!), and small things like sign conventions can wind up being terribly inconvenient later in a project’s lifetime if chosen poorly. So thanks for the discussion, it’s been helpful, and better grounded me in the math.