While working with GraphFrame.to_hdf(), I noticed that the PyTables backend does not support the current datatype of the nid column, which raises:
TypeError: objects of type ``IntegerArray`` are not supported in this context, sorry; supported objects are: NumPy array, record or scalar; homogeneous list or tuple, integer, float, complex or bytes
Most pandas column datatypes are cast to NumPy datatypes. We needed pd.Int64Dtype() for time series data because np.int64 cannot represent NaN values. Casting the "nid" column to np.float64 instead of pd.Int64Dtype() fixes the issue with PyTables, and we can still support NaN values for time series.
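A minimal sketch of the cast described above, using a plain pandas DataFrame as a stand-in for the GraphFrame's underlying dataframe. The helper name cast_nid_to_float and the sample data are hypothetical and are not part of this change; the actual code in this PR may differ.

```python
import numpy as np
import pandas as pd


def cast_nid_to_float(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose "nid" column is np.float64 (hypothetical helper)."""
    out = df.copy()
    # pd.NA entries in the nullable Int64 column become np.nan after the cast.
    out["nid"] = out["nid"].astype(np.float64)
    return out


# Illustrative data: a nullable-integer "nid" column with one missing value.
df = pd.DataFrame(
    {
        "nid": pd.array([0, 1, None], dtype=pd.Int64Dtype()),
        "time": [0.5, 1.2, 0.9],
    }
)

fixed = cast_nid_to_float(df)
print(df["nid"].dtype)     # nullable extension dtype, backed by an IntegerArray
print(fixed["nid"].dtype)  # plain NumPy dtype, backed by an ndarray
```

After the cast the column is backed by an ordinary NumPy array, which is among the object types the error message lists as supported, and the missing entry is kept as NaN.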