Bernmeister closed this issue 3 years ago
@Bernmeister, you have run into a subtle behavior of Pandas. If .loc[name] asks about a name for which only one row exists, you get back that single row as a Series. But if the name exists several times in the file, you get back a DataFrame with all n of the matching rows! Yes, it is a grand source of bugs: depending on the data, .loc[] can return either of two different types, so the code that follows must be prepared to handle both.
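For anyone who wants to see the two return types side by side, here is a minimal sketch. The dataframe and the comet designations in it are made up for illustration and are not taken from the attached MPC file.

```python
import pandas as pd

# Hypothetical data: the designation "C/2020 A1" appears twice in the index.
df = pd.DataFrame(
    {"eccentricity": [1.0, 0.5, 0.7]},
    index=["C/2020 A1", "C/2020 A1", "C/2021 B2"],
)

unique = df.loc["C/2021 B2"]    # one matching row   -> Series
repeated = df.loc["C/2020 A1"]  # two matching rows  -> DataFrame

print(type(unique).__name__)
print(type(repeated).__name__)

# With a duplicated label, .eccentricity is a Series of booleans after the
# comparison, and using it as an `if` condition raises ValueError.
try:
    if repeated.eccentricity == 1.0:
        pass
except ValueError as error:
    print("ValueError:", error)
```

The same comparison on `unique` works without complaint, which is why the failure only shows up for designations that occur more than once.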
I confirmed the problem by adding print statements until the behavior was clear. Here's what I did to your loop, in case you need to do further investigations like this in the future:
    for name, row in dataframe.iterrows():
        print(row)
        print(row.eccentricity)
        print(dataframe.loc[ name ])
        print(dataframe.loc[ name ].eccentricity)
        if dataframe.loc[ name ].eccentricity == 1.0:
            print('e == 1.0')
Catching the exception, by the way, was hiding the traceback, which is necessary for seeing where the code is failing; so as my first step I removed the try…except maneuver to learn more about what was going on.
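If the try…except does need to stay in a script, one option (a sketch, not something from the original script) is to print the full traceback inside the handler using the standard library's traceback module. The function risky() below is a hypothetical stand-in for the failing comparison.

```python
import traceback


def risky():
    # Hypothetical stand-in for the code that fails inside the loop.
    raise ValueError("The truth value of a Series is ambiguous.")


try:
    risky()
except ValueError:
    # format_exc() returns the same text Python would have printed
    # if the exception had not been caught.
    report = traceback.format_exc()
    print(report)
```

The printed report includes the file, line number, and function name where the exception was raised, which is the information a bare except block throws away.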
I would suggest simply using the row value you already have instead of indexing back into the dataframe with .loc[], which goes to the expense of building the row object all over again.
Try .comet_orbit(row) and see if your script's behavior improves!
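To illustrate why the row from iterrows() is safe, here is a small sketch with made-up data in which one designation is duplicated. Each row that iterrows() yields is a Series for exactly one row, so the comparison always produces a single true-or-false value.

```python
import pandas as pd

# Hypothetical data: the designation "C/2020 A1" appears twice in the index.
df = pd.DataFrame(
    {"eccentricity": [1.0, 0.5, 0.7]},
    index=["C/2020 A1", "C/2020 A1", "C/2021 B2"],
)

parabolic = []
for name, row in df.iterrows():
    # `row` describes one row only, even when `name` is not unique.
    if row.eccentricity == 1.0:
        parabolic.append(name)

print(parabolic)  # ['C/2020 A1']
```

Only the first of the two "C/2020 A1" rows has an eccentricity of 1.0, so the name is recorded once and no exception is raised.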
Well that worked! Still trying to get my head around numpy (and frankly only heard about it once I started using Skyfield). Thanks again!
When running the code below, I get the following exception on a bunch of comets:
The code can either load the comet data from the Minor Planet Center OR use a locally saved copy (to avoid hammering the MPC server).
The test code:
I have also attached the data file (in case there is something odd in today's data and it disappears tomorrow).
Soft00Cmt.txt