SEMCOG / semcog_urbansim

Pickle compatibility in PyTable #29

Closed tianxie1995 closed 2 years ago

tianxie1995 commented 2 years ago

Error encountered:

Pickle protocol 5, which PyTables uses when writing under Python 3.8, cannot be read by Python < 3.8.

Error message while reading table from HDFStore:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/da/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 578, in __getitem__
    return self.get(key)
  File "/home/da/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 770, in get
    return self._read_group(group)
  File "/home/da/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 1764, in _read_group
    return s.read()
  File "/home/da/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 3145, in read
    values = self.read_array(f"block{i}_values", start=_start, stop=_stop)
  File "/home/da/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 2805, in read_array
    ret = node[0][start:stop]
  File "/home/da/anaconda3/lib/python3.7/site-packages/tables/vlarray.py", line 677, in __getitem__
    return self.read(start, stop, step)[0]
  File "/home/da/anaconda3/lib/python3.7/site-packages/tables/vlarray.py", line 821, in read
    outlistarr = [atom.fromarray(arr) for arr in listarr]
  File "/home/da/anaconda3/lib/python3.7/site-packages/tables/vlarray.py", line 821, in <listcomp>
    outlistarr = [atom.fromarray(arr) for arr in listarr]
  File "/home/da/anaconda3/lib/python3.7/site-packages/tables/atom.py", line 1224, in fromarray
    return pickle.loads(array.tostring())
  File "/home/da/anaconda3/lib/python3.7/site-packages/pandas/compat/pickle_compat.py", line 266, in loads
    fd, fix_imports=fix_imports, encoding=encoding, errors=errors
  File "/home/da/anaconda3/lib/python3.7/pickle.py", line 1088, in load
    dispatch[key[0]](self)
  File "/home/da/anaconda3/lib/python3.7/pickle.py", line 1107, in load_proto
    raise ValueError("unsupported pickle protocol: %d" % proto)
ValueError: unsupported pickle protocol: 5
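
The mismatch can be checked without pandas or PyTables. The sketch below is only an illustration: the helper names `pickle_protocol` and `readable_here` are hypothetical and not part of any library. It relies on the fact that pickles written with protocol 2 or later begin with the PROTO opcode (byte 0x80) followed by the protocol number, and that `pickle.HIGHEST_PROTOCOL` is 4 on Python 3.4 through 3.7 and 5 from Python 3.8.

```python
import pickle
import sys


def pickle_protocol(data: bytes) -> int:
    """Return the protocol declared in a pickle's header (hypothetical helper).

    Protocols 2 and later start with the PROTO opcode (0x80) followed by
    one byte holding the protocol number. Protocols 0 and 1 have no such
    header, so this sketch reports both of them as 0.
    """
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return 0


def readable_here(data: bytes) -> bool:
    """True if this interpreter's pickle module supports the declared protocol."""
    return pickle_protocol(data) <= pickle.HIGHEST_PROTOCOL


if __name__ == "__main__":
    # Example payload written with protocol 4, which Python 3.4+ can read.
    payload = pickle.dumps({"example": [1, 2, 3]}, protocol=4)
    print("Python", sys.version_info[:2], "supports up to protocol", pickle.HIGHEST_PROTOCOL)
    print("payload protocol:", pickle_protocol(payload), "readable:", readable_here(payload))
```

On a Python 3.7 interpreter, `readable_here` would return False for any payload whose header declares protocol 5, which corresponds to the ValueError in the traceback above.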

Temporary workaround: convert all tables from pickle protocol 5 to protocol 4. Run the script under Python 3.8 or later, which can read protocol 5.

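import pickle

import pandas as pd

old_hdf = pd.HDFStore("data/all_semcog_data_02-02-18-final-forecast_newbid.h5", mode="r")
new_hdf = pd.HDFStore("data/all_semcog_data_02-02-18-final-forecast_newbid_pickle4.h5")

# PyTables pickles object data with pickle.HIGHEST_PROTOCOL, so cap it at 4
# while writing and restore the original value afterwards.
original_protocol = pickle.HIGHEST_PROTOCOL
pickle.HIGHEST_PROTOCOL = 4
try:
    for k in old_hdf.keys():
        df = old_hdf[k]
        df.to_hdf(new_hdf, key=k, mode="a")
        print(k)  # report progress
finally:
    pickle.HIGHEST_PROTOCOL = original_protocol
    new_hdf.close()
    old_hdf.close()

semcogli commented 2 years ago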

How about upgrading conda to a newer version?

tianxie1995 commented 2 years ago

I tried updating the conda base environment to Python 3.8, and it failed after a while, saying that it couldn't resolve the compatibility issues involved in the upgrade. I guess we could also create a fresh Python 3.8 env and reinstall all the libraries, but it would not be worth the effort.

tianxie1995 commented 2 years ago

Migrated VM to mint203 with python 3.9