ChristopherRabotin commented 5 years ago

Foremost, thanks for coding and maintaining jplephem. It's really an excellent library. In fact, I've implemented my own ephemeris reader using some of your work.

As demonstrated by the following example code, there is a position difference between spiceypy and jplephem on the order of 1e-5km. The velocity difference is on the order of 1e-12 km/s. The velocity error is small enough to not impact my work, but the position error is a bit more concerning. In general however, I would expect the difference to be machine precision (1e-16 on my computer).

I dove into the SPK Required Reading document to identify any differences in the way the position is computed, but that effort turned out fruitless.

Do you have any hint as to what could cause that difference? I'd be happy to dive into it myself.

Example

Code

from jplephem.spk import SPK
import spiceypy as sp

# These numbers are computed using hifitime https://github.com/ChristopherRabotin/hifitime/
# This date time corresponds to 07 February 2002 at midnight TAI.
jde_et_days = 2452312.5007428810931742
jde_et_secs = 66312064.18493939

# SPICE
sp.furnsh('de438s.bsp')
moon_em_bary_sp, _ = sp.spkezr('301', jde_et_secs, 'J2000', "NONE", '3')
# earth_moon_out

# jplephem
s = SPK.open('de438s.bsp')
moon_sg = s.pairs[(3, 301)]
moon_em_bary_jp = moon_sg.compute_and_differentiate(jde_et_days)

print("position delta (km):", moon_em_bary_sp[:3] - moon_em_bary_jp[0])
print("velocity delta (km/s):",
      moon_em_bary_sp[3:] - moon_em_bary_jp[1]/86400.)

Output

> ipython delta.py
position delta (km): [ 1.24332291e-05 -2.63471156e-06 -2.37874337e-06]
velocity delta (km/s): [7.56694707e-12 3.09001713e-11 1.29118383e-11]

brandon-rhodes commented 5 years ago

Foremost, thanks for coding and maintaining jplephem. It's really an excellent library. In fact, I've implemented my own ephemeris reader using some of your work.

You're welcome, and I'm very glad to see it inspiring new work! That's Rust code?

As demonstrated by the following example code, there is a position difference between spiceypy and jplephem on the order of 1e-5km.

Now that I've caught up following PyColorado, I've taken a few minutes to look at your code (thanks for providing a full working example!) and have not found any obvious reasons why these two routines should give different results. The difference relative to the size of the vector is around 3e-11 which, I agree, is much bigger than I would expect for two implementations of the same polynomial.

The agreement between Skyfield, which uses jplephem, and the NOVAS library is considerably better than that, so I'll be interested to learn if you find any problem with the jplephem routines.

I'd be happy to dive into it myself.

Yes, please do so! You should be able to trace essentially the same math in both libraries, and hopefully find the moment at which they diverge. Good luck, and I look forward to hearing the result!

ChristopherRabotin commented 5 years ago

What do you mean by the following?

The difference relative to the size of the vector is around 3e-11

I have a vector of size 6 (three entries for position, and three for velocity).

I also just noticed that I didn't provide a link to the de438s.bsp file: https://naif.jpl.nasa.gov/pub/naif/JUNO/kernels/spk/de438s.bsp (~21 MB).

What DE file did you use to get 3e-11 ?

I haven't yet had the time to dive into the implementation of SPICE and jplephem. I am considering reaching out the SPICE team directly (I contacted them maybe eighteen months ago when I needed help reading BSP files).

brandon-rhodes commented 5 years ago

What do you mean by the following?

I meant, if you take the length of the "delta" vector you printed out and divide by the length of the vector itself, using the same DE file your script uses. I would call 3 the "number of dimensions" of the position and velocity vectors. I'll try to avoid the term "size" and use "length" or "number of dimensions" instead.

ChristopherRabotin commented 4 years ago

I might have some time to look into this in the next couple of weeks, so I'm posting some information here for reference.

The CSPICE function which computes the geometric states is spkgeo. That function seems to only handle the tree of centers. The function spkpvn retrieves the coefficients. Then spkr03.c grabs the segment descriptor. There are more calls after that which seem related only to retrieving the segment descriptor, not the actual computation. Yet somehow, after line 744 of spkgeo, we have the state. There is some weird syntax here which might be from the FORTRAN to C transliteration.
The full code of SPICE is available here, which allows for easy searching. In general, a given function xyz__ is defined at the root of the src/cspice folder directly in xyz.c, unless it's a "private" function in which case it'll be in zzxyz.c. Documentation is very helpful.

brandon-rhodes commented 4 years ago

@ChristopherRabotin — As February wanes, I wanted to double-check how your investigation is going. I also wanted to offer a code snippet, because I think my answers back in September were far too brief. Instead of trying to explain what I meant in words, I should have just shown it in code. So, here are a few extra lines of code that you can paste in right at the bottom of the script you provided above (and, thanks again for providing a good example script with your question!):

rlength = (moon_em_bary_jp[0]**2).sum()**0.5
vlength = (moon_em_bary_jp[1]**2).sum()**0.5
rdiff = moon_em_bary_sp[:3] - moon_em_bary_jp[0]
vdiff = moon_em_bary_sp[3:] - moon_em_bary_jp[1]/86400.
print("length of position vector:", rlength)
print("length of velocity vector:", vlength)
print("position relative error:", rdiff / rlength)
print("velocity relative error:", vdiff / vlength)

The output should answer your question that I carelessly left dangling, "What DE file did you use to get 3e-11?":

length of position vector: 383219.5205694992
length of velocity vector: 86322.11183493689
position relative error: [ 3.22947206e-11 -6.84345661e-12 -6.17870491e-12]
velocity relative error: [8.72581538e-17 3.56314417e-16 1.48888686e-16]

I am dividing each error by the length of the corresponding vector, because the limited number of floating point digits get "used up" by digits on both sides of the decimal point. So, for example, the first error 1.24332291e-05 that you print would be expected if the number had 16 - 5 = 11 digits ahead of the decimal point, but would be a worrying error if the number had only 1 or 2 digits to the left of the decimal point.

The result is surprising. The velocity looks perfect — the difference between the libraries is down at the very basement of 64-bit floating point, in the noise: a full 16 digits of agreement!

But the position is about 10,000 times worse in agreement.

I wonder how the velocity could be perfect but the position off by that large an amount. I'll ponder and let you know if I think of anything. Otherwise, the difference will hopefully show up as you do your deep dive into the code!

ChristopherRabotin commented 4 years ago

Thanks for the follow up!

I'm been stupidly busy with other tasks and have not looked much into the problem since the last messages I wrote here. However, it was brought to my attention (again) a week ago that Ephemeris Time, according to SPCIE, is actually TBD and not TDT. There is a minute difference in the definition of those time systems (and hifitime currently does not support TBD). Therefore, I suspect that one source of error would be the time conversion for the times I used in the script above.

Just to make sure I understand your previous comment, are you saying that we should divide the error by the length of the vector because the floating mantissa representation may overblow the actual errors? If so, that would make sense in my mind because the velocity numbers are small compared to the position numbers.

brandon-rhodes commented 4 years ago

Therefore, I suspect that one source of error would be the time conversion for the times I used in the script above.

Could you clarify how that error would creep in? Re-reading your code, it looks like you take an exact moment in TDB and pass it to both libraries, both of which expect TDB as their argument. I do not see any other timescales involved?

Just to make sure I understand your previous comment, are you saying that we should divide the error by the length of the vector because the floating mantissa representation may overblow the actual errors?

Yes, I am suggesting dividing by the length because 64-bit floats only remember about 16 decimal digits of precision. So, for example, the number 1e15 can be accurate to within 1.0, but the number 1e16 cannot “see” numbers that small:

n = 1e15
n += 1.0
print(n)
n += 1.0
print(n)
n += 1.0
print(n)

n = 1e16
n += 1.0
print(n)
n += 1.0
print(n)
n += 1.0
print(n)

Output:

1000000000000001.0
1000000000000002.0
1000000000000003.0
1e+16
1e+16
1e+16

So error always needs to be compared to the whole length because, in a number like the velocity we have here, it is simply not possible to carry further digits of information past the e-11 limit observed here.

ChristopherRabotin commented 4 years ago

@brandon-rhodes We're looking into this at the moment. I was wondering if you recalled where you found the original algorithm for the Chebyshev interpolation as done in the SPK files. I can't seem to find anything in the NAIF documentation at the moment.

I did however find this code from Juan Arrieta (from Nabla Zero Labs): https://gist.github.com/arrieta/c2b56f1e2277a6fede6d1afbc85095fb .

brandon-rhodes commented 4 years ago

@ChristopherRabotin — I used the Chebyshev routines in the USNO’s “NOVAS” library:

https://github.com/brandon-rhodes/python-novas/blob/99de8ed096b56836941b27203543bc4db737b475/Cdist/eph_manager.c#L794

Skyfield was designed to return the same coordinates as NOVAS given the same time and ephemeris, where "same" means "to roughly machine precision". You can find tests in Skyfield that keep the library working to very close tolerance versus NOVAS:

https://github.com/skyfielders/python-skyfield/blob/master/skyfield/tests/test_against_novas.py

ChristopherRabotin commented 4 years ago

After lots of debugging by @mrpossoms , it seems like the ET seconds I was using initially was wrong. The value should have been 66312064.184926435351 (instead of 66312064.18493939).

The deltas are now:

position delta (km): [-7.15954229e-08  1.51921995e-08  1.37079041e-08]
velocity delta (km/s): [-4.34097203e-14 -1.77996506e-13 -7.43294315e-14]

brandon-rhodes commented 4 years ago

Thank you for following up, @ChristopherRabotin, and I'm glad that one mystery at least is solved.

brandon-rhodes / python-jplephem

Position difference between spiceypy and jplephem (order of 1e-5 km); velocity OK #33

Example

Code

Output