Closed Bernmeister closed 3 years ago
Try printing t.utc_jpl()
so that we can find out the date on which you are trying to generate a Sun and Earth position. Thanks!
Adding the above print results in:
A.D. 2020-Sep-16 00:00:00.0000 UT
Thank you, that eliminates the first possibility (that it was so far in the past or future as to exceed the date range).
Adding:
barycentric = body.at(t)
print(barycentric.distance())
— just before the fatal call, I see:
nan au
So for the moment you could protect your code by checking whether the position was a not-a-number floating point value before proceeding.
And we can in the future improve Skyfield’s behavior here, I think. The routine that tries to back-date positions for light travel time, instead of reaching impossibly far back in time in the case of a nan
, should just try producing a nan
as its output. Let’s leave this issue open until we can make that adjustment!
Well, how interesting. Because integers at the machine level cannot represent nan
, it looks like when nan
gets converted deep inside jplephem
into an integer index into the .bsp
file, it turns into the integer (drum roll—)
-9223372036854775808
Which then indexes outside of the bounds of the file and produces the error.
Looking at _correct_for_light_travel_time()
, it will have to become more complicated if it’s to handle nan
values that might appear in the middle of an array that's otherwise in good shape.
I do want Skyfield to become transparent with respect to nan's, so that a big vector with a few nan's inside doesn't cause any problems. But I'll have to think about this one a bit, and see if there's a way to tackle it in the light-backdating case that won't cause too much expense for the 99.99% case where none of the values are nan
.
Dunno if this adds to the diagnosis or not, but noticed these messages along with the exception:
/usr/local/lib/python3.6/dist-packages/jplephem/spk.py:227: RuntimeWarning: invalid value encountered in double_scalars index2, offset2 = divmod(tdb2 * S_PER_DAY, intlen)
/usr/local/lib/python3.6/dist-packages/jplephem/spk.py:228: RuntimeWarning: invalid value encountered in divmod index3, offset = divmod(offset1 + offset2, intlen)
Dunno if this adds to the diagnosis or not, but noticed these messages along with the exception…
I see only one mention (!) of that on the Web, and it seems to be in exactly the same situation, where divmod()
is given a nan
:
https://github.com/numba/numba/pull/5619#issuecomment-658091806
So I think those messages are in line with our hypothesis that nan
is among the values the SPK routine is being asked to deal with.
Is this is something that should/could be handled within Skyfield, or rather I just wrap a try/except around the code?
Here are two ways of detecting the situation for now. First, after generating the position, you could check whether it has nan
values inside:
from numpy import isnan
print(isnan(barycentric.position.au).any())
This prints True
for the problem position, and could help you detect it.
An even better approach might be to simply ignore orbits like this one with no semimajor_axis_au
specified, as without any data on the size of the orbit, there is no way that a position can be computed. If you print(dataframe.loc[ name ])
in your code you will see the not-a-number value for that field. What if you loaded the file first, and removed rows without a semimajor axis?
with open("Soft00Distant.txt", 'rb') as f:
dataframe = skyfield.data.mpc.load_mpcorb_dataframe( f ).set_index( "designation", drop = False )
dataframe = dataframe[~dataframe.semimajor_axis_au.isnull()]
for name, row in dataframe.iterrows():
orbit = skyfield.data.mpc.mpcorb_orbit(row, timeScale, skyfield.constants.GM_SUN_Pitjeva_2005_km3_s2)
# then, add to `sun` and so forth
Note that in your code as it stands, I'm not sure there's any purpose in doing a set_index()
on the dataframe, as your dataframe only ever appears to have one row? Your script will already probably feel slow because you are running load_mpcorb_dataframe()
on each line instead of one the whole file, and stopping to generate a one-item index for that one row will slow it down further, in a situation where you probably want performance because of the number of lines in the file. Try it out and let me know if it helps!
I didn't realise that the file could be loaded into a dataframe in one go. Yes, I've found that comets and minor planets run an order of magnitude or two slower than PyEphem (and was going to post about a little later on). My use case is in two parts: first I download the MPC data files (one for comets and four for minor planets). For each file, I remove a comet/minor planet if it exceeds (apparent) magnitude of 15. I then write out a binary blob file containing what's left. Secondly I use the binary blob on each update of my software loop to read in this much smaller subset.
I presume what you have written above in the code sample (open the text file and load into dataframe) is what you refer to speeding things up, then I'll have a go at using that. Much appreciated!
note: I'm not a developer
As of today at least, two comets cause different kinds problems
row_neo = comets.loc['C/2020 F3 (NEOWISE)']
row_oum = comets.loc['1I/`Oumuamua']
The reasons might be that:
Both of these show problems internally at mpc.comet_orbit()
but depending on what you do, you find out that there is a problem in all kinds of different ways, sometimes a lot further down the road
Question: Would a thorough checking within mpc.comet_orbit()
even if it just prints a warning, save a lot of time and energy?
--
1791 comets loaded
skyfield.__version__: 1.23
NEOWISE test 1 (this must be trapped, otherwise crash)
(<class 'ValueError'>, ValueError('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'), <traceback object at 0x11af1bc48>)
NEOWISE test 2 this is interesting
(<class 'ValueError'>, ValueError('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().'), <traceback object at 0x11af1bb48>)
NEOWISE test 3 because the warning is different
/Users/david/anaconda3/lib/python3.7/site-packages/skyfield/keplerlib.py:256: RuntimeWarning: invalid value encountered in double_scalars
return 2 * arctan(((1+e)/(1-e))**.5 * tan(E/2))
Oumuamua test 1 seems no problem
Oumuamua test 2 but there is
(<class 'AttributeError'>, AttributeError("'_KeplerOrbit' object has no attribute '_mu_km_s'"), <traceback object at 0x11af315c8>)
Oumuamua test 3 no sign of trouble here though we know there is...
Sum of 2 vectors:
+ Segment 'de421.bsp' 0 SOLAR SYSTEM BARYCENTER -> 10 SUN
+ <Geometric ICRS position and velocity at date t center=10 target=1I/`Oumuamua>
Oumuamua test 4 and it shows up with a different name
(<class 'AttributeError'>, AttributeError("'VectorSum' object has no attribute 'distance'"), <traceback object at 0x11a868ec8>)
comes from the following script
from skyfield.api import Loader
from skyfield.data import mpc
from skyfield.constants import GM_SUN_Pitjeva_2005_km3_s2 as GM_SUN
from skyfield.api import load
import skyfield, sys
loaddata = Loader('~/Documents/fishing/SkyData') # avoids multiple copies of large files
ts = loaddata.timescale() # include builtin=True if you want to use older files (you may miss some leap-seconds)
eph = loaddata('de421.bsp')
earth, sun, moon = [eph[x] for x in ('earth', 'sun', 'moon')]
with load.open(mpc.COMET_URL) as f:
comets = mpc.load_comets_dataframe(f)
comets = comets.set_index('designation', drop=False)
print(len(comets), 'comets loaded')
print('skyfield.__version__: ', skyfield.__version__)
row_neo = comets.loc['C/2020 F3 (NEOWISE)']
row_oum = comets.loc['1I/`Oumuamua']
print('\nNEOWISE test 1 (this must be trapped, otherwise crash)')
try:
test = mpc.comet_orbit(row_neo, ts, GM_SUN)
except:
print(sys.exc_info())
print('\nNEOWISE test 2 this is interesting')
try:
comet = sun + mpc.comet_orbit(row_neo, ts, GM_SUN)
except:
print(sys.exc_info())
print('\nNEOWISE test 3 because the warning is different')
comet = sun + mpc.comet_orbit(row_oum, ts, GM_SUN)
print('\nOumuamua test 1 seems no problem')
try:
test = mpc.comet_orbit(row_oum, ts, GM_SUN)
except:
print(sys.exc_info())
print('\nOumuamua test 2 but there is')
try:
test = mpc.comet_orbit(row_oum, ts, GM_SUN)
print(test)
except:
print(sys.exc_info())
print('\nOumuamua test 3 no sign of trouble here though we know there is...')
try:
bary = sun + mpc.comet_orbit(row_oum, ts, GM_SUN).at(ts.now())
print(bary)
except:
print(sys.exc_info())
print('\nOumuamua test 4 and it shows up with a different name')
try:
bary = sun + mpc.comet_orbit(row_oum, ts, GM_SUN).at(ts.now())
print(bary.distance())
except:
print(sys.exc_info())
@brandon-rhodes To give an update, your suggestion to use
dataframe = dataframe[ ~dataframe.semimajor_axis_au.isnull() ]
does indeed prevent the exception
skyfield.errors.EphemerisRangeError: ephemeris segment only covers dates 1899-07-28 23:59:18Z through 2053-10-08 23:58:51Z UT
from occurring. Happy for this ticket to be closed, unless you intended to weave some magic within Skyfield.
Also I did try your suggestion of loading the dataframe in once, but it didn't make that much difference to speed (but I'll tackle that in another ticket so that this can be closed at your discretion).
@Bernmeister — Thanks for following up! I have updated the docs to describe filtering the dataframe to remove invalid orbits, and hopefully any users who miss that advice will be led here or to the docs when they search for the exception name.
Oh, and, for the record, I believe that this line — which is missing its semimajor axis — was the culprit from Soft00Distant.txt
:
K12D30R 7.1 0.15 K194R 0.04515 195.57163 341.47709 77.98645 0.9909325 0.00001530 MPO434970 209 8 2000-2014 0.26 M-v 38h MPC 0000 2012 DR30
On running the test below:
I get the following:
skyfield.errors.EphemerisRangeError: ephemeris segment only covers dates 1899-07-28 23:59:18Z through 2053-10-08 23:58:51Z UT
which occurs during both the sun and earth observation.
On looking at the downloaded data, there is a piece of data missing from one field when compared to lines before/after. I presume this missing field is causing the exception. Is this is something that should/could be handled within Skyfield, or rather I just wrap a try/except around the code?