Closed dacb closed 6 years ago
Great point. I was having some issues with pickle. I'm in the process of converting everything over to dill. My understanding is that dill, unlike pickle, saves class/methods information. I'll look into HDFS as well.
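A minimal sketch of the pickle limitation being discussed, using only the standard library. pickle stores functions and classes by reference (module plus qualified name) rather than by value, so anything without an importable name, such as a lambda, cannot be pickled. dill is meant to cover those cases and exposes the same dumps/loads interface; it is left out here only to keep the sketch dependency-free. The helper name try_pickle is made up for illustration.

```python
import pickle


def try_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Plain data (dicts, lists, numbers, strings) pickles without trouble.
data_ok = try_pickle({"weights": [0.1, 0.2], "epochs": 3})

# A lambda has no importable name, so pickle has nothing to reference.
lambda_ok = try_pickle(lambda x: x * 2)

# A function that does pickle is stored as a name only, not as code:
# the payload holds the module and function name and nothing else.
payload = pickle.dumps(len)

print("plain data picklable:", data_ok)
print("lambda picklable:", lambda_ok)
print("pickled len():", payload)
```

The last line shows why a pickle only loads where the same modules and names are importable: the byte string names the function but does not carry its definition.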
I wasn't aware of dill, thanks! It doesn't solve the malicious code problem, but it seems to ease sharing across platforms. Cool!
The real problem with HDFS is that you still need to serialize your objects into the HDFS blobs yourself. I'll ask around for some libraries to do that part.
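To illustrate what "serialize your objects yourself" involves, here is a rough sketch that flattens an object to plain data and rebuilds it explicitly. JSON from the standard library stands in for the storage backend, so nothing here is HDFS-specific. TrainedModel and its fields are hypothetical names chosen for the example.

```python
import json


class TrainedModel:
    """Hypothetical object we want to store without pickle."""

    def __init__(self, name, learning_rate, weights):
        self.name = name
        self.learning_rate = learning_rate
        self.weights = weights

    def to_dict(self):
        # Spell out exactly which fields are persisted.
        return {
            "name": self.name,
            "learning_rate": self.learning_rate,
            "weights": self.weights,
        }

    @classmethod
    def from_dict(cls, data):
        # Rebuild the object from plain data; no code is loaded from the file.
        return cls(data["name"], data["learning_rate"], data["weights"])


original = TrainedModel("baseline", 0.01, [0.5, -1.25, 3.0])
blob = json.dumps(original.to_dict())            # what gets written to storage
restored = TrainedModel.from_dict(json.loads(blob))

print(blob)
print(restored.name, restored.learning_rate, restored.weights)
```

The cost is the boilerplate in to_dict and from_dict. The benefit is that the stored bytes contain only data, so reading them back cannot run code the way unpickling can.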
dangerous pickles! https://intoli.com/blog/dangerous-pickles/
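The "malicious code problem" can be sketched in a few lines. An object's `__reduce__` method tells pickle which callable to invoke at load time, so whoever writes the pickle chooses what runs when it is read. The Payload class is hypothetical, and a harmless `eval` of arithmetic stands in for what a real attack would do. The second half follows the "restricting globals" pattern from the Python pickle documentation. SafeUnpickler and safe_loads are illustrative names, and this version simply refuses every global rather than keeping an allow-list.

```python
import io
import pickle


class Payload:
    """Hypothetical attacker-controlled object."""

    def __reduce__(self):
        # pickle will call eval("6 * 7") when the blob is loaded.
        # A real attack would name something like os.system instead.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # runs eval; no Payload instance is rebuilt
print("loads returned:", result)


class SafeUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (function or class)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data):
    """Load a pickle made of plain containers and scalars only."""
    return SafeUnpickler(io.BytesIO(data)).load()


print("plain data still loads:", safe_loads(pickle.dumps([1, 2, 3])))

try:
    safe_loads(blob)
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

Refusing globals limits what a hostile pickle can call, but it also rules out loading custom classes, which is much of the reason to use pickle or dill in the first place.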
Keras has its own serialization to HDF5: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
and h5py could be used to serialize other objects into HDF5 as well: http://docs.h5py.org/en/latest/index.html
Since these are not very large data files, HDF5 may be sufficient.
After some finicking with Travis CI, the keras switch is successful
AWESOME!
Pickling objects in Python is a potentially problematic solution for serializing objects. Issues that have been identified with pickling include:
An alternative option may include HDFS.