skyfielders / python-skyfield

Elegant astronomy for Python
MIT License

IndexError when printing DataFrame containing Time items #835

Closed: gfairchild closed this issue 1 year ago

gfairchild commented 1 year ago

I just noticed that printing a Pandas DataFrame that contains Time objects raises an IndexError. Here's a minimal working example:

from skyfield.api import load
import pandas as pd

ts = load.timescale()
t1 = ts.utc(2020, 1, 1, 0, 0, 0)
print(t1)
t2 = ts.utc(2020, 2, 1, 0, 0, 0)
print(t2)

df = pd.DataFrame({'time': [t1, t2]})
print(df.info())
print(df)

Here's the output I get:

<Time tt=2458849.500800741>
<Time tt=2458880.500800741>
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   time    2 non-null      object
dtypes: object(1)
memory usage: 144.0+ bytes
None
Traceback (most recent call last):
  File "/Users/gfairchild/Desktop/test.py", line 12, in <module>
    print(df)
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/core/frame.py", line 1064, in __repr__
    return self.to_string(**repr_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/core/frame.py", line 1245, in to_string
    return fmt.DataFrameRenderer(formatter).to_string(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 1136, in to_string
    string = string_formatter.to_string()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/string.py", line 30, in to_string
    text = self._get_string_representation()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/string.py", line 45, in _get_string_representation
    strcols = self._get_strcols()
              ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/string.py", line 36, in _get_strcols
    strcols = self.fmt.get_strcols()
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 617, in get_strcols
    strcols = self._get_strcols_without_index()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 883, in _get_strcols_without_index
    fmt_values = self.format_col(i)
                 ^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 897, in format_col
    return format_array(
           ^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 1328, in format_array
    return fmt_obj.get_result()
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 1359, in get_result
    fmt_values = self._format_strings()
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 1422, in _format_strings
    fmt_values.append(f" {_format(v)}")
                          ^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/format.py", line 1402, in _format
    return str(formatter(x))
               ^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/io/formats/printing.py", line 221, in pprint_thing
    elif is_sequence(thing) and _nest_lvl < get_option("display.pprint_nest_depth"):
         ^^^^^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/pandas/core/dtypes/inference.py", line 388, in is_sequence
    len(obj)  # Has a length associated with it.
    ^^^^^^^^
  File "/usr/local/Caskroom/miniconda/base/envs/test/lib/python3.11/site-packages/skyfield/timelib.py", line 420, in __len__
    return self.shape[0]
           ~~~~~~~~~~^^^
IndexError: tuple index out of range

The DataFrame does seem to get created properly (you'll see above that df.info() shows 2 items), but when I try to print it out (print(df)), the IndexError is raised. What's going on here?

Thanks!

brandon-rhodes commented 1 year ago

Interesting! Pandas performs a series of operations like iter(t) and len(t) to figure out what sort of thing the time object is, and those operations specifically need to raise TypeError when t is a scalar time for Pandas to understand what's going on. I've just landed a quick fix, which should appear in the next release. If you want to try it out now, you can run:

pip install -U https://github.com/skyfielders/python-skyfield/archive/master.zip
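
For anyone curious about the mechanism, here is a rough sketch of it. This is neither Skyfield's nor Pandas's actual code: BuggyTime, FixedTime, and looks_like_sequence are hypothetical names, and the probe is only a simplified stand-in for the is_sequence check that appears in the traceback. It assumes the probe treats TypeError as "not a sequence" and lets every other exception propagate.

```python
class BuggyTime:
    """Hypothetical stand-in for a time object that is either scalar or array-valued."""

    def __init__(self, values):
        # A bare float models a single time; a list models an array of times.
        self.values = values
        self.shape = (len(values),) if isinstance(values, list) else ()

    def __iter__(self):
        # A generator method: iter() succeeds without running this body,
        # so a duck-typing probe moves on to len().
        yield from (self.values if self.shape else ())

    def __len__(self):
        # For a scalar, shape is the empty tuple, so this raises IndexError.
        return self.shape[0]


class FixedTime(BuggyTime):
    """Same hypothetical object, but scalars raise TypeError as probes expect."""

    def __iter__(self):
        if not self.shape:
            raise TypeError("cannot iterate over a single time")
        return iter(self.values)

    def __len__(self):
        if not self.shape:
            raise TypeError("a single time has no length")
        return self.shape[0]


def looks_like_sequence(obj):
    """Simplified stand-in for a duck-typing probe (an assumption, not Pandas code)."""
    try:
        iter(obj)  # Can we iterate over it?
        len(obj)   # Does it have a length?
        return not isinstance(obj, (str, bytes))
    except (TypeError, AttributeError):
        # Only these exceptions mean "not a sequence"; anything else escapes.
        return False
```

With these definitions, looks_like_sequence(FixedTime(2458849.5)) returns False and the probe carries on, whereas looks_like_sequence(BuggyTime(2458849.5)) lets the IndexError escape, which matches the last frames of the traceback above.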

gfairchild commented 1 year ago

Thanks so much!