grd349 / PBjam

A repo for our peak baggin code and tips on jam
MIT License

"Datatype coercion is not allowed" when creating session with custom timeseries array #262

Open warrickball opened 3 years ago

warrickball commented 3 years ago

Here's a script that creates a basic timeseries of Gaussian noise in a 2×1000 array.

#!/usr/bin/env python3

import numpy as np
import pbjam

n = 1000
data = np.zeros((2,n))
data[0] = np.arange(n, dtype=float)
data[1] = np.random.randn(n)

s = pbjam.session(ID='mwe', numax=(100, 1), teff=(5000, 100), bp_rp=(0.7, 0.005), dnu=(5, 0.1), timeseries=data)

It fails with the following traceback:

Traceback (most recent call last):
  File "/home/wball/try/pbjam/mwe.py", line 11, in <module>
    s = pbjam.session(ID='mwe', numax=(100, 1), teff=(5000, 100), bp_rp=(0.7, 0.005), timeseries=data)
  File "/home/wball/pypi/PBjam/pbjam/session.py", line 572, in __init__
    _format_col(vardf, timeseries, 'timeseries')
  File "/home/wball/pypi/PBjam/pbjam/session.py", line 289, in _format_col
    vardf[key] = [_arr_to_lk(x, y, vardf['ID'][0], key)]
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3163, in __setitem__
    self._set_item(key, value)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3239, in _set_item
    value = self._sanitize_column(key, value)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3899, in _sanitize_column
    value = maybe_convert_platform(value)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 112, in maybe_convert_platform
    values = construct_1d_object_array_from_listlike(values)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1638, in construct_1d_object_array_from_listlike
    result[:] = values
  File "/home/wball/.local/lib/python3.9/site-packages/astropy/table/table.py", line 853, in __array__
    raise ValueError('Datatype coercion is not allowed')
ValueError: Datatype coercion is not allowed

I had a brief look around. The problem isn't the conversion of the timeseries into a Lightkurve object but rather the step that adds that object to the vardf dataframe.

I just did a git pull so I'm using the top of master (commit 0c5591a). If any other versions are relevant, they are:

warrickball commented 3 years ago

@nielsenmb couldn't reproduce this in Python 3.7 and neither can I with Python 3.7.4. I do, however, hit this with Python 3.8.2 and
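Since the failure seems to depend on the Python version, and possibly on which package versions that Python pulls in, it may help to record the environment alongside each report. Here is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper name `collect_versions` is hypothetical, and the package list is an assumption based on the libraries that appear in the scripts and traceback in this thread.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Return a dict mapping 'python' and each package name to its installed version."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Keep the entry so a missing package is visible in the report.
            versions[name] = "not installed"
    return versions


# Assumed package list: the libraries named in the scripts and traceback in this thread.
report = collect_versions(["numpy", "pandas", "astropy", "lightkurve", "pbjam"])
for name, version in report.items():
    print(f"{name}: {version}")
```

Running this in both a working and a failing environment and comparing the two reports should narrow down which dependency differs.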

warrickball commented 3 years ago

Creating the LightCurve object seems to be fine, so I had a closer look at why assigning the timeseries that are downloaded via Lightkurve works but passing a custom timeseries doesn't. Mimicking the code in PBjam, I tried this:

import numpy as np
import pandas as pd
import lightkurve as lk
import pbjam

n = 1000
data = np.zeros((2,n))
data[0] = np.arange(n, dtype=float)/720
data[1] = np.random.randn(n)

df = pd.DataFrame({'ID': np.array(['test']).reshape((-1,1)).flatten()})
df['timeseries'] = [lk.LightCurve(time=data[0], flux=data[1])]

which is similar to the code path followed for custom timeseries and reproduces the "Datatype coercion is not allowed" error message.

If I change the last line to this, which tries to be more like the path for downloaded objects (i.e. when you pass a string identifier), it appears to work:

df.at[0, 'timeseries'] = lk.LightCurve(time=data[0], flux=data[1], targetid='test')

nielsenmb commented 3 years ago

vardf.at[0, key] = _arr_to_lk(x, y, vardf['ID'][0], key) raises an error for me on Python 3.7, but

vardf[key] = object()
vardf.at[0, key] = _arr_to_lk(x, y, vardf['ID'][0], key)

seems to work.
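For reference, the two-step pattern (create an object-dtype column first, then set the single cell with `.at`) can be sketched without PBjam or Lightkurve installed. `FakeLightCurve` is a hypothetical stand-in for `lk.LightCurve`, used only so the snippet runs on its own. It does not define `__array__`, so it will not reproduce the original error; it only illustrates the assignment pattern.

```python
import pandas as pd


class FakeLightCurve:
    """Hypothetical stand-in for lk.LightCurve so the sketch runs without lightkurve."""

    def __init__(self, time, flux):
        self.time = time
        self.flux = flux


df = pd.DataFrame({"ID": ["test"]})

# Step 1: create the column with object dtype before any real value goes in.
df["timeseries"] = pd.Series([None] * len(df), index=df.index, dtype=object)

# Step 2: set the single cell by label, rather than assigning a list to the column.
lc = FakeLightCurve(time=[0.0, 1.0, 2.0], flux=[0.1, -0.2, 0.3])
df.at[0, "timeseries"] = lc

print(df["timeseries"].dtype)
print(type(df.at[0, "timeseries"]).__name__)
```

Whether this avoids the coercion error with a real LightCurve on every Python and pandas combination is untested here; the comments above only report it working in the environments mentioned.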