skyfielders / python-skyfield

Elegant astronomy for Python
MIT License
1.41k stars 211 forks source link

faster way to observe()? #496

Closed nkelhos closed 3 years ago

nkelhos commented 3 years ago

I've got code that calculates the ra/dec of an asteroid from the position of a satellite, but it runs slowly (~0.15sec per observe()), and I was wondering if my code could be rewritten in a faster way. Eventually, I'd like to be able to run it for a large project that would need many observe() calls (10^10 observe() calls per asteroid, for 500k asteroids).

import io
from skyfield.api import load
from skyfield.data.mpc import load_mpcorb_dataframe
from skyfield.data.mpc import mpcorb_orbit
from skyfield.constants import GM_SUN_Pitjeva_2005_km3_s2 as GM_SUN
from skyfield.toposlib import Topos

sc_lats  = [...] # list of satellite latitudes, degs
sc_lons  = [...] # list of satellite longitudes, degs
sc_alts  = [...] # list of satellite altitudes, meters
sc_times = [...] # list of times, astropy.time.Time objects

# asteroid Eunomia from Minor Planet Center's MPCORB.DAT
ast    = 15
a_line = '00015    5.2   0.15 K20CH  60.84584   98.61793  292.93525   11.75338  0.1863457  0.22921812   2.6442555  0 MPO530953  2394  79 1851-2020 0.55 M-v 38h MPCW       0000     (15) Eunomia            20200107'

eph    = load( 'de430.bsp' )
sun    = eph['sun'  ]
earth  = eph['earth']
a_line = io.BytesIO( a_line.encode('utf-8') )
a_info = load_mpcorb_dataframe( a_line )
a_info = a_info.set_index('designation_packed')
a_info = a_info.loc['%05d'%ast]
ts     = load.timescale()
a_pos  = sun + mpcorb_orbit( a_info, ts, GM_SUN )

for i in range( len(sc_lats) ) : # 1e10 loops per asteroid
  fermi = earth + Topos( latitude_degrees=sc_lats[i], longitude_degrees=sc_lons[i], elevation_m=sc_alts[i] )
                                         # 0.00036 sec per call  (earth+Topos())
  t1    = ts.from_astropy( sc_times[i] ) # 0.00024 sec per call
  raw   = fermi.at(t1)                   # 0.0024  sec per call
  raw   = raw.observe(a_pos)             # 0.15    sec per call :(
  rdd   = raw.radec()                    # 0.00007 sec per call
  print('final ra/dec/dist:',rdd)

I'm pretty sure it would run faster if numpy arrays were used in the underlying code, but I recognize implementing those changes would be a large amount work for the devs. Is there any way I can speed up the quoted code?

JoshPaterson commented 3 years ago

One thing that should speed this up a bit is that you can initialize t1 so that it represents all of the times in sc_times in one object. Then when that's used as the argument for the at() method the observe() method will return a vector of positions.

I'm currently working on vectorizing the computation of KeplerOrbit positions so that you can ask for the position of many objects at many times all at once, which should greatly speed up the kind of computation you're talking about doing. There's more discussion of this in issue #490.

nkelhos commented 3 years ago

I tried your suggestion:

ts     = load.timescale()
t1     = ts.from_astropy( sc_times )

but I get an error:

Traceback (most recent call last):
  File "./test_skyfield_deconstruct.py", line 31, in <module>
    t1     = ts.from_astropy( sc_times )
  File "/home/nkelhos/.local/lib/python3.7/site-packages/skyfield/timelib.py", line 305, in from_astropy
    return self.tt(jd=t.tt.jd)
AttributeError: 'list' object has no attribute 'tt'
nkelhos commented 3 years ago

To aid debugging, I condensed the code into this stand-alone script that recreates the error:

import ephem
from astropy.time import Time, TimeDelta                    
from skyfield.api import load 

sc_times = []
tdelta  = TimeDelta(2000,format='sec')                      
tstart  = Time('2015-01-01 00:00:00')
for i in range(10) :
  sc_times += [ tstart + (i*tdelta) ]

ts = load.timescale()
t1 = ts.from_astropy( sc_times ) 

output:

Traceback (most recent call last):
  File "pyephem_debug.py", line 12, in <module>
    t1 = ts.from_astropy( sc_times )
  File "/home/nkelhos/.local/lib/python3.7/site-packages/skyfield/timelib.py", line 305, in from_astropy
    return self.tt(jd=t.tt.jd)
AttributeError: 'list' object has no attribute 'tt'
JoshPaterson commented 3 years ago

That is happening because the from_astropy method is expecting an astropy.time.Time object, but the script is giving it a list of astropy.time.Time objects. Instead of a list of astropy.time.Time objects that each represent a single instant, you can make a single astropy.time.Time object that represents multiple instants like this:

from astropy.time import Time, TimeDelta
from skyfield.api import load
from numpy import arange

tdelta  = TimeDelta(2000*arange(10),format='sec')
tstart  = Time('2015-01-01 00:00:00')
sc_times = tstart + tdelta

ts = load.timescale()
t1 = ts.from_astropy(sc_times)

I don't have much experience with Astropy so there may be better ways to do that. If you wanted to create a Skyfield Time object directly without involving Astropy you could do it like this:

from skyfield.api import load
from numpy import arange
ts = load.timescale()
t1 = ts.utc(2015, second=2000*arange(10))

If you give a Skyfield Time object that contains an array of multiple times to the observe method in your script the objects that are output will contain arrays of positions. However, I'm not sure you'll be able to take full advantage of this because you are creating a different Topos object for each observation time. Would it be possible for you to create an EarthSatellite object to represent the satellite?

JoshPaterson commented 3 years ago

Here's an example script that observes Eunomia at 10,000 different times from the Hubble Space Telescope. On my computer (an older laptop) it takes about 2.5 seconds

import io
from skyfield.api import Loader, EarthSatellite
from skyfield.data.mpc import load_mpcorb_dataframe, mpcorb_orbit
from skyfield.constants import GM_SUN_Pitjeva_2005_km3_s2 as GM_SUN
load = Loader('~/projects/skyfield-data')
from numpy import linspace

ts = load.timescale()

eph = load('de430.bsp')
sun = eph['sun']
earth = eph['earth']

line = '00015    5.2   0.15 K20CH  60.84584   98.61793  292.93525   11.75338  0.1863457  0.22921812   2.6442555  0 MPO530953  2394  79 1851-2020 0.55 M-v 38h MPCW       0000     (15) Eunomia            20200107'
stream = io.BytesIO(line.encode('utf-8'))
row = load_mpcorb_dataframe(stream).iloc[0]
eunomia = sun + mpcorb_orbit(row, ts, GM_SUN)

line1 = '1 20580U 90037B   20339.56385875  .00000904  00000-0  43966-4 0  9993'
line2 = '2 20580  28.4701 274.4706 0002897 122.3805  40.5875 15.09500437481787'
hubble = earth + EarthSatellite(line1, line2, 'HST', ts)

num_times = 1e4
t = ts.utc(linspace(2015, 2016, num_times))

ra, dec, dist = hubble.at(t).observe(eunomia).radec()
print(ra.radians.shape)
JoshPaterson commented 3 years ago

As you said, it would be better if it was also vectorized for multiple different asteroids; that's probably necessary for the project you mentioned. I'm trying to make that work now and hopefully will be successful!

nkelhos commented 3 years ago

I'd agree with you that loading my satellite via its TLE (rather than its lat/lon/alt in my first code snippet) would run faster, but those TLEs make me nervous.

My goal is to search through 10 years of satellite data (from 2010 to now), and from what I understand, satellite TLE's get more inaccurate the further you get from the TLE's epoch (1-2km per day) . Going backwards 10 years, the error on my satellite's TLE-derived position would grow to nearly the radius of the earth, which would introduce parallax errors in the apparent RA/Dec of asteroids. That parallax error would be stronger the closer the asteroid passed by earth, and would equal all my other errors put together when the asteroid is at about 20 lunar orbits (7.7e9 km), so I'm still pretty interested in a vectorized lat/lon/alt method over the TLE method, but I understand that it'd take more time and effort to develop, so don't worry about it too much.

I may be able to use the earth barycenter as the observer position for now, and remove the parallax later in the analysis, but I'd have to think about it some more.

craigim commented 3 years ago

Most satellite TLEs are published at least once per week. So if you're searching through 10 years of satellite data on, say, https://space-track.org, you can pull all TLEs for a particular satellite for the past 10 years (be sure to read the terms of service before pulling that much data). Each TLE has an epoch date around which it's good for at least a couple weeks. Depending on how precise you need, you could pick one TLE per month and do your calculation over 120 1-month windows using a contemporary TLE. Not as good as doing it all at one go, but certainly better than doing it for each individual time point.

JoshPaterson commented 3 years ago

I think that if you already have the position data for the satellite you could initialize a Geocentric position object directly. You could then use that object's observe method with a vector of times and get back a vector of positions. Brandon will be back soon and he'll know for sure what the best approach is.

brandon-rhodes commented 3 years ago

Yes, I'm back at the keyboard now! So here’s an initial thought: is there any chance you could express the satellite’s position in the fixed GCRS reference frame? One very big expense of your current code is that the satellite positions are expressed relative to the Earth, so that Skyfield has to apply nutation, precession, and the frame bias before it knows the positions relative to the GCRS. And nutation in particular is a very, very expensive operation.

nkelhos commented 3 years ago

If you give a Skyfield Time object that contains an array of multiple times to the observe method in your script

I did this, and the processing time is now around 0.1 seconds per coordinate (down from 0.15s), though thats still pretty large. I've been comparing other software for speed and accuracy. The JPL Horizons package is about the same (0.09s per coordinate), though it requires submitting a network request to their server, and it can't handle a moving lat/lon/alt satellite very well. I also tried pyephem, which runs at 0.0003s per coordinate (!), but I had to hack together a formatted orbit line from MPCORB.DAT into the Xephem format. Even so, its still off by ~3deg in both RA and Dec, so something's still off.

you can pull all TLEs for a particular satellite for the past 10 years

I'd like to use TLEs, but without a better understanding of how inaccurate they are (e.g. "RaDec error is X deg per month away from epoch"), I'm trying to limit their use. I'm already having to use a TLE for the asteroid's orbit, which I'm going to have to hunt down the ephemerides for to at least get an accurate date range for each. Maybe pulling my satellite's TLE from spacetrack and comparing that to its reported lat/lon/alt would be a way to quantify its accuracy.

is there any chance you could express the satellite’s position in the fixed GCRS reference frame?

I'm not sure how to do that. The satellite data I'm looking through also stores the satellite's latitude, longitude, and altitude in its datafile, so that's what I've been loading into skyfield. Topo() seemed like the only way to get a lat/lon/alt coordinate into skyfield, but if there's another class that handles lat/lon/alt coords, I'd be interested in it.

I've also noticed there's still a small difference between JPL Horizons coordinates and Skyfield. For one satellite position and time, the asteroid coordinates computed with skyfield and jpl horizons are shown below. The JPL Horizons just uses the asteroid number, while skyfield uses the orbit line as found in MPCORB.DAT). The difference is around 0.1-0.3deg in each RA and Dec. I'm thinking JPL is using older orbital parameters, but are there any other potential causes?

satellite lat/lon/alt : (-25.410931 deg, 29.656080 deg, 567703.958968 m)
time: 56155.035067293094 mjd

##### asteroid 15789 #####
orbline: MPCORB     : '15789    7.0   0.15 K20CH  71.81889  317.41340  354.50668    5.14719  0.1888393  0.00395393  39.6080302  2 MPO355237    91   9 1993-2009 0.57 M-v 38h MPCLINUX   400A  (15789) 1993 SC            20090918\n'
radec  : skyfield   :  30.436597  15.784646 deg
radec  : jplhoriz   :  30.632030  15.849120 deg

##### asteroid 37117 #####
orbline: MPCORB     : '37117   13.29  0.15 K205V  27.52717  180.12088  248.51029   13.80422  0.5535177  0.05464455   6.8776193  0 E2020-QM5   589   9 1984-2020 0.45 M-v 3Ek Pan        000A  (37117) Narcissus          20200425\n'
radec  : skyfield   : 261.330658 -18.524061 deg
radec  : jplhoriz   : 261.691120 -18.524730 deg

##### asteroid 38083 #####
orbline: MPCORB     : '38083    7.00  0.15 K205V 106.05522   80.18760   10.03697   12.76549  0.1505610  0.00404266  39.0263279  2 E2020-QM5    79  16 1999-2020 0.73 M-v 3Ek Pan        000A  (38083) Rhadamanthus       20200421\n'
radec  : skyfield   : 197.518377 -10.115604 deg
radec  : jplhoriz   : 197.672650 -10.180090 deg

##### asteroid 47171 #####
orbline: MPCORB     : '47171    4.99  0.15 K205V   7.80659  294.39491   97.00953    8.42447  0.2245572  0.00398722  39.3872414  1 E2020-QM5   489  23 1974-2020 0.46 M-v 3Ek Pan        000A  (47171) Lempo              20200722\n'
radec  : skyfield   :  28.517668   3.047685 deg
radec  : jplhoriz   :  28.696180   3.120970 deg

##### asteroid 47932 #####
orbline: MPCORB     : '47932    6.2   0.15 K20CH  15.32361  195.94916   26.14455   10.81976  0.2845255  0.00395646  39.5911382  0 MPO385122   141  15 1989-2016 0.38 M-v 38h MPCLINUX   000A  (47932) 2000 GN171         20160629\n'
radec  : skyfield   : 222.124140 -20.432726 deg
radec  : jplhoriz   : 222.293780 -20.486060 deg

##### asteroid 52747 #####
orbline: MPCORB     : '52747    7.9   0.15 K20CH 322.82882  245.84748   64.12260    0.54970  0.0659631  0.00329500  44.7265483  3 MPO 66038    22   6 1998-2004 0.23 M-v 38h MPCW       400A  (52747) 1998 HM151         20040527\n'
radec  : skyfield   : 254.403302 -22.788325 deg
radec  : jplhoriz   : 254.612570 -22.808560 deg
brandon-rhodes commented 3 years ago

@nkelhos — I have time in the next few days to look at your script for performance ideas, but it sounds like you have improved it (with a time vector) since last sharing an example. Could you share a stand-alone script that reads a tiny list of lat/lon/altitude values (that you can just include inline in a StringIO if you don't want to provide a text file) and produces the output you're looking for? That will make sure I'm clear on what the current version of your script does, and will also help compare against HORIZONS. Thanks!

nkelhos commented 3 years ago

I wrote a standalone script, though its a little long. The text lines for the orbital parameters and the satellite coordinates are directly written into the script. It needs pyephem, skyfield, astroquery, and astropy to run. It still requires access to the file de430.bsp, whose path is defined at the top of the script.

I added coordinate calculation for pyephem, with the 'correct' Soft03Distant.txt orbital line (pyephem-a), and a line from converting the asteroid's MPCORB.DAT line into the xephem format (pyephem-b). The epochs of pyephem-a and -b are 20 years different though, which I think is the source of the large difference in their coordinates.

Be aware the times used are slightly different (by 0.8 seconds) between jplhorizions and the others, due to jplhorizons only accepting a 'YYYY-MM-DD HH:MM:SS' string for the observation time, so that's why they are printed. I don't think that's the cause of differences (these asteroids tend to move ~0.2deg/day (for 0.8 sec = 0.00000185 deg of movement), but I print it anyway.

The standalone script's output:

$ ./coord_compare_github.py 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Warning, running this script makes 30 network calls
to the JPL Horizons server (6 asteroids * 5 different
satellite times), so please run responsibly...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

satellite:
  times: [56155.03506729 56155.03541417 56155.03576139 56155.03610861
 56155.03645583] mjd
  alts: [567703.958968216, 567786.2034901858, 567830.5178270738, 567837.8134326029, 567804.8521373646] meters
  lats: [-25.41093, -25.549461, -25.658817, -25.738668, -25.788858] deg
  lons: [29.65608, 31.600145, 33.550415, 35.503963, 37.459793] deg

MPCORB.DAT lines are from https://www.minorplanetcenter.net/iau/MPCORB.html (MPCORB.DAT (uncompressed))
mpc2xephem is from converting MPCORB.DAT line to Soft03's format via mpcorb2xephem()
Soft03 lines are from https://minorplanetcenter.net/iau/Ephemerides/Comets/SoftwareComets.html (xephem (E. Downey) link)

##########################
##### asteroid 15789 #####
##########################
orbit lines:
  MPCORB.DAT: '15789    7.0   0.15 K20CH  71.81889  317.41340  354.50668    5.14719  0.1888393  0.00395393  39.6080302  2 MPO355237    91   9 1993-2009 0.57 M-v 38h MPCLINUX   400A  (15789) 1993 SC            20090918\n'
  mpc2xephem: '15789 1993 SC,e,5.14719,354.50668,317.41340,39.6080302,0.00395393,0.1888393,71.81889,04/27.0/2019,2020.959016,H 7.0,0.15'
  Soft03    : '15789 1993 SC,e,5.1547,354.6048,316.0865,39.42895,0.0039809,0.18750038,70.7780,04/27.0/2019,2000,H 7.0,0.15\n'

calculated coordinates:
              ra____________  dec___________  time______________
  pyephem-a :  30.638599 deg   15.854483 deg  0.000360 sec/coord (using untouched line from Soft03Distant.txt, epoch=2000)
  pyephem-b :  32.854101 deg   16.811836 deg  0.000326 sec/coord (using reformatted MPC line via mpcorb2xephem(), epoch=2020.959016)
  skyfield  :  30.436593 deg   15.784646 deg  0.135080 sec/coord
  jplhoriz  :  30.632030 deg   15.849120 deg  0.079683 sec/coord

##########################
##### asteroid 37117 #####
##########################
orbit lines:
  MPCORB.DAT: '37117   13.29  0.15 K205V  27.52717  180.12088  248.51029   13.80422  0.5535177  0.05464455   6.8776193  0 E2020-QM5   589   9 1984-2020 0.45 M-v 3Ek Pan        000A  (37117) Narcissus          20200425\n'
  mpc2xephem: '37117 Narcissus,e,13.80422,248.51029,180.12088,6.8776193,0.05464455,0.5535177,27.52717,04/27.0/2019,2020.412568,H13.29,0.15'
  Soft03    : '37117 Narcissus,e,13.8050,248.5169,180.1662,6.881308,0.0546006,0.55361445,5.6541,04/27.0/2019,2000,H13.3,0.15\n'

calculated coordinates:
              ra____________  dec___________  time______________
  pyephem-a : 261.614774 deg  -18.519557 deg  0.000341 sec/coord (using untouched line from Soft03Distant.txt, epoch=2000)
  pyephem-b : 270.779974 deg  -16.518429 deg  0.000373 sec/coord (using reformatted MPC line via mpcorb2xephem(), epoch=2020.412568)
  skyfield  : 261.330636 deg  -18.524061 deg  0.148560 sec/coord
  jplhoriz  : 261.691120 deg  -18.524730 deg  0.075409 sec/coord

##########################
##### asteroid 38083 #####
##########################
orbit lines:
  MPCORB.DAT: '38083    7.00  0.15 K205V 106.05522   80.18760   10.03697   12.76549  0.1505610  0.00404266  39.0263279  2 E2020-QM5    79  16 1999-2020 0.73 M-v 3Ek Pan        000A  (38083) Rhadamanthus       20200421\n'
  mpc2xephem: '38083 Rhadamanthus,e,12.76549,10.03697,80.18760,39.0263279,0.00404266,0.1505610,106.05522,04/27.0/2019,2020.412568,H 7.00,0.15'
  Soft03    : '38083 Rhadamanthus,e,12.7453,10.0024,81.2732,39.13194,0.0040263,0.15000828,103.3118,04/27.0/2019,2000,H 7.1,0.15\n'

calculated coordinates:
              ra____________  dec___________  time______________
  pyephem-a : 197.674036 deg  -10.178298 deg  0.000354 sec/coord (using untouched line from Soft03Distant.txt, epoch=2000)
  pyephem-b : 198.657288 deg  -10.916768 deg  0.000331 sec/coord (using reformatted MPC line via mpcorb2xephem(), epoch=2020.412568)
  skyfield  : 197.518389 deg  -10.115609 deg  0.107084 sec/coord
  jplhoriz  : 197.672650 deg  -10.180090 deg  0.077389 sec/coord

##########################
##### asteroid 47171 #####
##########################
orbit lines:
  MPCORB.DAT: '47171    4.99  0.15 K205V   7.80659  294.39491   97.00953    8.42447  0.2245572  0.00398722  39.3872414  1 E2020-QM5   489  23 1974-2020 0.46 M-v 3Ek Pan        000A  (47171) Lempo              20200722\n'
  mpc2xephem: '47171 Lempo,e,8.42447,97.00953,294.39491,39.3872414,0.00398722,0.2245572,7.80659,04/27.0/2019,2020.412568,H 4.99,0.15'
  Soft03    : '47171 Lempo,e,8.4241,97.0130,294.5150,39.26610,0.0040057,0.22207705,6.1715,04/27.0/2019,2000,H 4.8,0.15\n'

calculated coordinates:
              ra____________  dec___________  time______________
  pyephem-a :  28.703174 deg    3.118370 deg  0.000359 sec/coord (using untouched line from Soft03Distant.txt, epoch=2000)
  pyephem-b :  30.822400 deg    4.053874 deg  0.000334 sec/coord (using reformatted MPC line via mpcorb2xephem(), epoch=2020.412568)
  skyfield  :  28.517663 deg    3.047681 deg  0.116057 sec/coord
  jplhoriz  :  28.696180 deg    3.120970 deg  0.075797 sec/coord

##########################
##### asteroid 47932 #####
##########################
orbit lines:
  MPCORB.DAT: '47932    6.2   0.15 K20CH  15.32361  195.94916   26.14455   10.81976  0.2845255  0.00395646  39.5911382  0 MPO385122   141  15 1989-2016 0.38 M-v 38h MPCLINUX   000A  (47932) 2000 GN171         20160629\n'
  mpc2xephem: '47932 2000 GN171,e,10.81976,26.14455,195.94916,39.5911382,0.00395646,0.2845255,15.32361,04/27.0/2019,2020.959016,H 6.2,0.15'
  Soft03    : '47932 2000 GN171,e,10.8058,26.0775,195.8557,39.74833,0.0039330,0.28769257,12.9358,04/27.0/2019,2000,H 6.2,0.15\n'

calculated coordinates:
              ra____________  dec___________  time______________
  pyephem-a : 222.276970 deg  -20.477395 deg  0.000354 sec/coord (using untouched line from Soft03Distant.txt, epoch=2000)
  pyephem-b : 226.250200 deg  -22.377862 deg  0.000329 sec/coord (using reformatted MPC line via mpcorb2xephem(), epoch=2020.959016)
  skyfield  : 222.124150 deg  -20.432728 deg  0.109948 sec/coord
  jplhoriz  : 222.293780 deg  -20.486060 deg  0.077447 sec/coord

##########################
##### asteroid 52747 #####
##########################
orbit lines:
  MPCORB.DAT: '52747    7.9   0.15 K20CH 322.82882  245.84748   64.12260    0.54970  0.0659631  0.00329500  44.7265483  3 MPO 66038    22   6 1998-2004 0.23 M-v 38h MPCW       400A  (52747) 1998 HM151         20040527\n'
  mpc2xephem: '52747 1998 HM151,e,0.54970,64.12260,245.84748,44.7265483,0.00329500,0.0659631,322.82882,04/27.0/2019,2020.959016,H 7.9,0.15'
  Soft03    : '52747 1998 HM151,e,0.5489,64.0892,243.4820,44.78110,0.0032890,0.06532828,322.9748,04/27.0/2019,2000,H 7.9,0.15\n'

calculated coordinates:
              ra____________  dec___________  time______________
  pyephem-a : 254.586361 deg  -22.805275 deg  0.000356 sec/coord (using untouched line from Soft03Distant.txt, epoch=2000)
  pyephem-b : 256.624870 deg  -23.009517 deg  0.000328 sec/coord (using reformatted MPC line via mpcorb2xephem(), epoch=2020.959016)
  skyfield  : 254.403298 deg  -22.788324 deg  0.108052 sec/coord
  jplhoriz  : 254.612570 deg  -22.808560 deg  0.076155 sec/coord

all displayed coordinates used this time:
  pyephem & skyfield time : 56155.03506729     mjd
  jpl horizons time       : 56155.035057870205 mjd

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Warning, running this script makes 30 network calls
to the JPL Horizons server (6 asteroids * 5 different
satellite times), so please run responsibly...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Pyephem-a coordinates seems closest (~0.05deg) to jplhoriz, followed by skyfield (~0.2deg off), while pyephem-b is off by around ~2 deg, likely due to the large epoch difference.

brandon-rhodes commented 3 years ago

@nkelhos — Thanks for the example script! Once Christmas is over, I will have the chance to try it out and see how its performance looks.

brandon-rhodes commented 3 years ago

@nkelhos — Your script's biggest problem is that it re-creates its Time objects, forcing them to recompute their nutation for each iteration. Here is an edited script that instead pre-builds several objects:

There is more you could do — for example, pre-creating the fermi objects so that they are only built and computed once. That should make things even faster.

But hopefully this example shows you how to build an object ahead of time. With the time objects pre-built, the additional computations are about 3× faster on my laptop.

#!/usr/bin/env python
import time, io
from astropy.time import Time

from skyfield.api       import load
from skyfield.data.mpc  import load_mpcorb_dataframe
from skyfield.data.mpc  import mpcorb_orbit
from skyfield.constants import GM_SUN_Pitjeva_2005_km3_s2 as GM_SUN
from skyfield.toposlib  import Topos

# from astroquery.jplhorizons import Horizons

#bspfile = 'de430.bsp'
bspfile = 'de430t.bsp'

ts = load.timescale()
eph = load(bspfile)
sun = eph['sun']
earth = eph['earth']

def asteroid_radec_skyfield( mpcline, times, t1_list, lats, lons, alts ):
  aster   = int(mpcline.split()[0])
  a_line  = io.BytesIO( mpcline.encode('utf-8') )
  a_info  = load_mpcorb_dataframe( a_line )
  a_info  = a_info.set_index('designation_packed')
  a_info  = a_info.loc['%05d'%aster]
  a_pos   = sun + mpcorb_orbit( a_info, ts, GM_SUN )
  n       = len(times)
  ras     = [0]*n
  des     = [0]*n
  dis     = [0]*n
  ta      = time.time()
  for i in range(n) :
    fermi  = earth + Topos( latitude_degrees=lats[i], longitude_degrees=lons[i], elevation_m=alts[i] )
    obs    = fermi.at(t1_list[i]).observe(a_pos)
    rdd    = obs.apparent().radec()
    ras[i] = rdd[0]._degrees
    des[i] = rdd[1].degrees
    dis[i] = rdd[2].m
  t2 = time.time()

  rt = (t2-ta)/n
  return ras, des, dis, rt

# satellite coordinates from satellite datafile
times = [56155.03506729, 56155.03541417, 56155.03576139, 56155.03610861, 56155.03645583] # mjd
alts  = [567703.958968216, 567786.2034901858, 567830.5178270738, 567837.8134326029, 567804.8521373646] # meters
lats  = [-25.41093, -25.549461, -25.658817, -25.738668, -25.788858] # deg
lons  = [29.65608, 31.600145, 33.550415, 35.503963, 37.459793] # deg
times = Time( times, format='mjd' )
print('satellite:')
print('  times:', times.mjd, 'mjd')
print('  alts:' , alts, 'meters')
print('  lats:' , lats, 'deg' )
print('  lons:' , lons, 'deg' )

t1 = ts.from_astropy(times)
t1_list = list(t1)

# source asteroid orbit lines
print()
print('MPCORB.DAT lines are from https://www.minorplanetcenter.net/iau/MPCORB.html (MPCORB.DAT (uncompressed))')
print('mpc2xephem is from converting MPCORB.DAT line to Soft03\'s format via mpcorb2xephem()')
print('Soft03 lines are from https://minorplanetcenter.net/iau/Ephemerides/Comets/SoftwareComets.html (xephem (E. Downey) link)')
mpclines = {}
mpclines[15789] = '15789    7.0   0.15 K20CH  71.81889  317.41340  354.50668    5.14719  0.1888393  0.00395393  39.6080302  2 MPO355237    91   9 1993-2009 0.57 M-v 38h MPCLINUX   400A  (15789) 1993 SC            20090918\n'
mpclines[37117] = '37117   13.29  0.15 K205V  27.52717  180.12088  248.51029   13.80422  0.5535177  0.05464455   6.8776193  0 E2020-QM5   589   9 1984-2020 0.45 M-v 3Ek Pan        000A  (37117) Narcissus          20200425\n'
mpclines[38083] = '38083    7.00  0.15 K205V 106.05522   80.18760   10.03697   12.76549  0.1505610  0.00404266  39.0263279  2 E2020-QM5    79  16 1999-2020 0.73 M-v 3Ek Pan        000A  (38083) Rhadamanthus       20200421\n'
mpclines[47171] = '47171    4.99  0.15 K205V   7.80659  294.39491   97.00953    8.42447  0.2245572  0.00398722  39.3872414  1 E2020-QM5   489  23 1974-2020 0.46 M-v 3Ek Pan        000A  (47171) Lempo              20200722\n'
mpclines[47932] = '47932    6.2   0.15 K20CH  15.32361  195.94916   26.14455   10.81976  0.2845255  0.00395646  39.5911382  0 MPO385122   141  15 1989-2016 0.38 M-v 38h MPCLINUX   000A  (47932) 2000 GN171         20160629\n'
mpclines[52747] = '52747    7.9   0.15 K20CH 322.82882  245.84748   64.12260    0.54970  0.0659631  0.00329500  44.7265483  3 MPO 66038    22   6 1998-2004 0.23 M-v 38h MPCW       400A  (52747) 1998 HM151         20040527\n'

soft03s = {}
soft03s[15789] = '15789 1993 SC,e,5.1547,354.6048,316.0865,39.42895,0.0039809,0.18750038,70.7780,04/27.0/2019,2000,H 7.0,0.15\n'
soft03s[37117] = '37117 Narcissus,e,13.8050,248.5169,180.1662,6.881308,0.0546006,0.55361445,5.6541,04/27.0/2019,2000,H13.3,0.15\n'
soft03s[38083] = '38083 Rhadamanthus,e,12.7453,10.0024,81.2732,39.13194,0.0040263,0.15000828,103.3118,04/27.0/2019,2000,H 7.1,0.15\n'
soft03s[47171] = '47171 Lempo,e,8.4241,97.0130,294.5150,39.26610,0.0040057,0.22207705,6.1715,04/27.0/2019,2000,H 4.8,0.15\n'
soft03s[47932] = '47932 2000 GN171,e,10.8058,26.0775,195.8557,39.74833,0.0039330,0.28769257,12.9358,04/27.0/2019,2000,H 6.2,0.15\n'
soft03s[52747] = '52747 1998 HM151,e,0.5489,64.0892,243.4820,44.78110,0.0032890,0.06532828,322.9748,04/27.0/2019,2000,H 7.9,0.15\n'

for asteroid in [15789, 37117, 38083, 47171, 47932, 52747] :

  print()
  print()
  print('##########################')
  print('##### asteroid %d #####' % asteroid)
  print('##########################')
  print('orbit lines:')

  mpcline    = mpclines[asteroid]
  soft03line = soft03s[asteroid]

  print('  MPCORB.DAT: %s' % repr( mpcline    ) )
  print('  Soft03    : %s' % repr( soft03line ) )

  sf_ras, sf_des, sf_dis, sf_rt = asteroid_radec_skyfield( mpcline, times, t1_list, lats, lons, alts)
  print()
  print('calculated coordinates:')
  print( '              ra____________  dec___________  time______________')
  print( '  skyfield  : %10.6f deg  %10.6f deg  %f sec/coord' % ( sf_ras[0], sf_des[0], sf_rt ) )
nkelhos commented 3 years ago

Hrm, I'm not sure whats going on. On my system the runtimes are all the same for all 3 scripts (slow:my slow uncorrected script, fast-nkelhos: my faster script with your suggested corrections, and fast-brhodes: the script you posted above). The times also all vary by about 5 %. I'm running this stuff on a virtual machine on my desktop, though I don't think that has a big influence, since it could run pyephem so fast.

asteroid - runtime

15789 - 0.1418s slow
        0.1354s fast-nkelhos
        0.1421s fast-bhrodes

37117 - 0.1472s slow
        0.1471s fast-nkelhos
        0.1520s fast-brhodes

38083 - 0.1067s slow
        0.1072s fast-nkelhos
        0.1124s fast-brhodes

47171 - 0.1075s slow
        0.1215s fast-nkelhos
        0.1063s fast-brhodes

47932 - 0.1051s slow
        0.1066s fast-nkelhos
        0.1113s fast-brhodes

52757 - 0.1137s slow
        0.1144s fast-nkelhos
        0.1134s fast-brhodes
brandon-rhodes commented 3 years ago

How are you computing the runtimes?

nkelhos commented 3 years ago

Its the same way in all 3 scripts, do all the pre-loading, start the timer, loop over each coordinate calculation, stop the timer, calculate and print the average time per loop:

  ta      = time.time()
  for i in range(n) :
    fermi  = earth + Topos( latitude_degrees=lats[i], longitude_degrees=lons[i], elevation_m=alts[i] )
    obs    = fermi.at(t1_list[i]).observe(a_pos)
    rdd    = obs.apparent().radec()
    ras[i] = rdd[0]._degrees
    des[i] = rdd[1].degrees
    dis[i] = rdd[2].m
  t2 = time.time()

  rt = (t2-ta)/n # average time per loop

I'm planning on lots of coordinate conversions (20,000 coordinates per batch, many batches), so the per-coordinate time is more significant than the pre-loading time.

brandon-rhodes commented 3 years ago

A question is: how big at the moment is n? The first time through the loop will be the expensive one, then the further iterations should be less expensive.

nkelhos commented 3 years ago

For each asteroid, n ~ 5, though I keep it low to not hammer the jpl horizons server with network calls. I reran the scripts with the timer excluding the first loop but the average time per loop is still ~= 0.1s per coord.

Its not super important to optimize this script further, I think. I would be interested in the vectorization that were mentioned in #526 . Is the timescale for that something like ~weeks?

brandon-rhodes commented 3 years ago

You might want to write the HORIZONS out to a file and then load from a file for your tests — or, if it's the time you are testing for, saving the number of seconds to a file. I would encourage you not to submit repeated requests to HORIZONS.

Re-reading your code: the speedup only works the second, third, fourth, etc time that a given ti_list[i] time gets used. The first time is slow; then the additional uses of that time get accelerated by its cached nutation matrix. So if your loop only uses each time once, then that explains why there is no speedup. My sample code, by contrast, used each time in several different computations.

Its not super important to optimize this script further, I think. I would be interested in the vectorization that were mentioned in #526 . Is the timescale for that something like ~weeks?

My guess would be that it will take at least a couple of months for that work to complete, as big decisions about Skyfield architecture need to be complete, and as everyone involved are volunteers. My own hope is that if we can design an approach that lets NumPy broadcasting work the way NumPy is designed, that we can have a working solution without too much work; but there are other proposals that involve more substantial rewriting, and if we needed to take one of those approaches then it might take more time.

brandon-rhodes commented 3 years ago

Since the underlying request behind this issue is now identical to #532, I am going to close this issue so I don't wind up tracking two issues about the same feature, and also to try to keep future discussion centralized in one place (as otherwise I sometimes have trouble tracking down on which issue an earlier comment was made). Hopefully the next few months will see progress!