Describe the bug
Pandas DataFrames created within glue and added to the data_collection manager may have columns of type 'object', which means they cannot be saved and restored by glue (glue.core.state._load_numpy calls np.load() without allow_pickle=True). This is generally not a problem when reading files with the Pandas data_factory (which converts columns), but it does cause problems for, for instance, datasets retrieved from external sources within a glue session.
To Reproduce
Steps to reproduce the behavior such as:
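A minimal sketch of the underlying NumPy behavior, outside of glue. It assumes the failure comes down to an object-dtype array being written with np.save and read back with the default np.load settings; the DataFrame contents and column names are made up for illustration, and no glue code is exercised here.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data: a mixed-content column, which pandas stores as dtype 'object'.
df = pd.DataFrame({"name": ["a", 1, None], "value": [1.0, 2.0, 3.0]})
column = df["name"].to_numpy()  # NumPy array with dtype object

# np.save pickles object arrays (allow_pickle defaults to True when saving).
buffer = io.BytesIO()
np.save(buffer, column)

# np.load defaults to allow_pickle=False, so reading the array back fails.
buffer.seek(0)
try:
    np.load(buffer)
    load_error = None
except ValueError as exc:
    load_error = exc

# Opting in to pickle makes the round trip succeed.
buffer.seek(0)
restored = np.load(buffer, allow_pickle=True)
```

Note that the NumPy documentation warns that loading pickled data can execute arbitrary code, so enabling allow_pickle for session files from untrusted sources carries a security cost.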
Expected behavior
Pandas objects created within glue should not break session files.
We could simply add allow_pickle to np.load(), but perhaps this has undesired side effects?
Details:
Additional context
Sample session file attached: pandas_dataframe_session.glu.gz