skyfielders / python-skyfield

Elegant astronomy for Python
MIT License
1.38k stars 208 forks source link

Error in mpc.mpcorb_orbit #861

Closed Errabundo66 closed 12 months ago

Errabundo66 commented 1 year ago

Hello. It's my first attempt, and I'm an absolute beginner in python. I made a code for extracting some asteroid ephemeris sending a query to JPL Horizons and modifying an excel file. It was hard, but I succeeded.

But it needs to be connected to internet at every moment.

So I discovered your Skyfield suit and it's fantastic. My goal is downloading MPCORB.dat file, and working with it offline.

I tried successfully using Pandas dkl file to save the MPCORB.dat file in a Pandas format. In future code runs, the file is charged in a moment. I want to operate in the whole file, so I prefer not doing and excerpt.

I began just understanding the main principles, but I got an error. Here is my code and the error I got.

(the row variable is OK, showing all asteroid elements, etc...)

from skyfield.api import Loader, Topos
from skyfield.api import load, wgs84
from skyfield.data import mpc
from skyfield.almanac import find_discrete
import pandas as pd
from skyfield.constants import GM_SUN_Pitjeva_2005_km3_s2 as GM_SUN

# Cargar los datos de los planetas y la Tierra
#load = Loader('~/skyfield-data')
eph = load('de421.bsp')

# Definir la ubicación del observador en la Tierra (por ejemplo, Madrid)
observatory = wgs84.latlon(40.4168, -3.7038)

# Cargar el archivo MPCORB.dat
mpcorb = load.open('MPCORB.DAT')
print ("load hecho")

with mpcorb as f:
    minor_planets = mpc.load_mpcorb_dataframe(f)

# Guardar el DataFrame en un archivo binario
#minor_planets.to_pickle('data.pkl')

# Cargar el DataFrame desde el archivo binario
#minor_planets = pd.read_pickle('data.pkl')

print(minor_planets.shape[0], 'minor planets loaded')

# Index by designation for fast lookup.
minor_planets = minor_planets.set_index('designation', drop=False)
print ("index hecho")
# Sample lookups.
row = minor_planets.loc['(1234) Elyna']
print (row)

# Generating a position.

ts = load.timescale()
t = ts.utc(2020, 5, 31)

sun, earth = eph['sun'], eph['earth']

asteroid = sun + mpc.mpcorb_orbit(row, ts, GM_SUN)

ra, dec, distance = earth.at(t).observe(asteroid).radec()
print(ra)
print(dec)

I always got an error:

  File "/Users/fhuet/Library/CloudStorage/OneDrive-Personal/Python/Colab Notebooks copia/from skyfield.py", line 46, in <module>
    asteroid = sun + mpc.mpcorb_orbit(row, ts, GM_SUN)
  File "/Users/fhuet/opt/anaconda3/lib/python3.9/site-packages/skyfield/data/mpc.py", line 83, in mpcorb_orbit
    p = a * (1.0 - e*e)
TypeError: can't multiply sequence by non-int of type 'str'

What am I doing wrong? Thank you!

brandon-rhodes commented 1 year ago

When the documentation says the file needs to be "shorn of its text header", it means that if you leave the header in place, then Skyfield will not be able to use the resulting dataframe, because it will have strings inside of it instead of numbers. Take a look at the first orbit you have loaded with your code:

print(minor_planets.iloc[0])

Result:

designation_packed                      MINOR P
magnitude_H                                ANET
magnitude_G                               ENTER
epoch_packed                              ORBIT
mean_anomaly_degrees                   DATABASE
argument_of_perihelion_degrees           PCORB)
longitude_of_ascending_node_degrees         NaN
inclination_degrees                         NaN
eccentricity                                NaN
mean_daily_motion_degrees                   NaN
semimajor_axis_au                           NaN
uncertainty                                 NaN
reference                                   NaN
observations                                NaN
oppositions                                 NaN
observation_period                          NaN
rms_residual_arcseconds                     NaN
coarse_perturbers                           NaN
precise_perturbers                          NaN
computer_name                               NaN
hex_flags                                   NaN
designation                                 NaN
last_observation_date                       NaN
Name: 0, dtype: object

And compare those orbital parameters with the text at the top of the file:

MINOR PLANET CENTER ORBIT DATABASE (MPCORB)

This file contains published orbital elements for all numbered and unnumbered
multi-opposition minor planets for which it is possible to make reasonable
predictions.  It also includes published elements for recent one-opposition
minor planets and is intended to be complete through the last issued Daily
Orbit Update MPEC.  As such it is intended to be of interest primarily
to astrometric observers.
...

As you can see, Pandas is loading the title from the first line as though it were a set of orbital elements. And because the first few lines are simply text, Pandas makes every column a generic text column, so a and e don't wind up as numbers for Python to work with by the time you reach the exception.

At the moment the header is exactly 43 lines long, so one approach would be to do this after opening the file:

for i in range(43):
    print(repr(mpcorb.readline()))

That should skip those first 43 lines, printing them out so you can double-check that they are indeed the text header. Then you can successfully load the dataframe (if you have more RAM than I do!).

I should take a look at improving the documentation, so that it both shows this maneuver about how to skip the header lines right inside of Python, and so that it emphasizes even more strongly that the un-edited, un-excerpted file is not a pure data file and cannot be loaded into a meaningful Pandas dataframe.

Errabundo66 commented 1 year ago

That's fantastic! Now it works! Thanks a lot!!

It seems after reading the first 43 lines, when creating the dataframe from mpcorb, it only uses from line 44 onwards!

After...

with mpcorb as f:
    minor_planets = mpc.load_mpcorb_dataframe(f)

... If I do

print(minor_planets.iloc[0])

I get:

designation_packed                         00001
magnitude_H                                 3.33
magnitude_G                                 0.15
epoch_packed                               K232P
mean_anomaly_degrees                    17.21569
argument_of_perihelion_degrees          73.47045
longitude_of_ascending_node_degrees     80.26013
inclination_degrees                     10.58635
eccentricity                            0.078817
mean_daily_motion_degrees               0.214115
semimajor_axis_au                       2.767182
uncertainty                                    0
reference                              E2023-F87
observations                              7283.0
oppositions                                  123
observation_period                     1801-2023
rms_residual_arcseconds                     0.65
coarse_perturbers                            M-v
precise_perturbers                           30k
computer_name                           MPCLINUX
hex_flags                                   0000
designation                            (1) Ceres
last_observation_date                   20230321
Name: 0, dtype: object

The final code is this one:

from skyfield.api import Loader, Topos
from skyfield.api import load, wgs84
from skyfield.data import mpc
from skyfield.almanac import find_discrete
import pandas as pd
from skyfield.constants import GM_SUN_Pitjeva_2005_km3_s2 as GM_SUN

# Cargar los datos de los planetas y la Tierra
#load = Loader('~/skyfield-data')
eph = load('de421.bsp')

# Definir la ubicación del observador en la Tierra (por ejemplo, Madrid)
observatory = wgs84.latlon(40.4168, -3.7038)

# Cargar el archivo MPCORB.dat
mpcorb = load.open('MPCORB.DAT')

for i in range(43):
    repr(mpcorb.readline())

with mpcorb as f:
    minor_planets = mpc.load_mpcorb_dataframe(f)

# Guardar el DataFrame en un archivo binario
#minor_planets.to_pickle('data.pkl')

# Cargar el DataFrame desde el archivo binario
#minor_planets = pd.read_pickle('data.pkl')

print(minor_planets.shape[0], 'minor planets loaded')

# Index by designation for fast lookup.
minor_planets = minor_planets.set_index('designation', drop=False)

# Sample lookups.
row = minor_planets.loc['(1234) Elyna']
print (row)

# Generating a position.

ts = load.timescale()
t = ts.utc(2020, 5, 31)

sun, earth = eph['sun'], eph['earth']

asteroid = sun + mpc.mpcorb_orbit(row, ts, GM_SUN)

ra, dec, distance = earth.at(t).observe(asteroid).radec()
print(ra)
print(dec)
brandon-rhodes commented 12 months ago

I'm glad your code is now working!