slwatkins closed this issue 4 years ago
@slwatkins, unfortunately, I think deepdish is unmaintained.
Yeah, I kind of figured based on the commit history... Mostly wanted to put this here in case anyone else runs into the same issue, and so the time I spent on this bug didn't go completely to waste!
We forked the project and made a minimal version of the library for file loading/saving only, where we fixed this bug. You probably already have your own fork, but just in case: https://github.com/portugueslab/flammkuchen
Adding the one line of code suggested by @slwatkins solves the same issue for me. I want to say thank you for your effort and sharing.
It is sad to hear that deepdish is no longer maintained. This is one of my favorite packages, and it seems it would be best merged into pandas itself.
When saving a `pd.Series`, `pd.DataFrame`, or `pd.Panel` to HDF5 using deepdish, an `AttributeError` is raised, and I cannot save the file. I've tracked down the issue, and it's due to a change in Pandas version 0.24.0.
Here is how I've been able to reproduce the error, where I have installed Pandas 0.24.2, NumPy 1.15.4, deepdish 0.3.6, and PyTables 3.5.1.
The error returned is:
From the above, we see that the `_table_mod` variable is None, which is what throws the error. This became an error because of https://github.com/pandas-dev/pandas/pull/22919, where the exception handler in `HDFStore.get_node` was changed from a bare `except` to one that catches a specific exception.
Before: https://github.com/pandas-dev/pandas/blob/2d0c96119391c85bd4f7ffbb847759ee3777162a/pandas/io/pytables.py#L1157-L1165
After: https://github.com/pandas-dev/pandas/blob/master/pandas/io/pytables.py#L1141-L1149
So now `get_node` only returns None when the exception is a `NoSuchNodeError`, rather than on any error, and it looks that exception up through the `_table_mod` variable. However, `_table_mod` is only set by running the function `pandas.io.pytables._tables`, which imports PyTables into the namespace as `_table_mod`. If this function is not run, then `_table_mod` is left as None, and the above `AttributeError` occurs.
The problem is that deepdish uses `pandas.io.pytables.HDFStore` through a wrapper called `_HDFStoreWithHandle`, and none of the methods that call the `_tables` function are ever called, so `_table_mod` is left as None, which gives us the `AttributeError`.
My proposed solution is to add one line to the beginning of the `hdf5io.py` file in deepdish, where we call `pandas.io.pytables._tables`.
Before:
https://github.com/uchicago-cs/deepdish/blob/01af93621fe082a3972fe53ba7375388c02b0085/deepdish/io/hdf5io.py#L1-L12
After:
After making this change, I no longer get the `AttributeError` and the saving of Pandas data types works seamlessly.