beringresearch / ivis

Dimensionality reduction in very large datasets using Siamese Networks
https://beringresearch.github.io/ivis/
Apache License 2.0
330 stars, 43 forks

model_save: optimizer is not compatible with pickle #83

Closed. pmbaumgartner closed this issue 3 years ago.

pmbaumgartner commented 3 years ago

Calling save_model after fitting a supervised Ivis instance raises an error. It looks like some part of the optimizer cannot be pickled with Python's pickle module.

Replicate:

import ivis

# X (features) and y (labels) are assumed to be loaded already
i = ivis.Ivis(embedding_dims=10, n_epochs_without_progress=5)
i.fit(X, y)
i.save_model("model.ivis")
Traceback (most recent call last):
  File "src/ivis_persist.py", line 69, in <module>
    ivises[output].save_model(f"models/{output}.ivis")
  File "/Users/pbaumgartner/anaconda3/envs/env/lib/python3.7/site-packages/ivis/ivis.py", line 404, in save_model
    pkl.dump(self.model_.optimizer, f)
AttributeError: Can't pickle local object 'make_gradient_clipnorm_fn.<locals>.<lambda>'

System Info: Running ivis==2.0.0 on macOS with Python 3.7.

pmbaumgartner commented 3 years ago

I also tried this after downgrading to ivis==1.8.4 and tensorflow==2.3.0, and save_model works under those versions.

Szubie commented 3 years ago

Hey, thanks for reporting this issue. It sounds like you encountered it on tensorflow==2.4.0, is that correct? It's possible that a change in the new TensorFlow version breaks our method of saving models.

I'll look into the problem and see if there's a quick fix. For now it's probably best to downgrade TensorFlow to 2.3 as you've done.

pmbaumgartner commented 3 years ago

Yep, I was on tensorflow==2.4.0 in the original attempt as well. It looks like there's a function in TensorFlow that returns a lambda, and a lambda defined inside another function cannot be pickled with the stdlib pickle module.
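For illustration, the failure can be reproduced without TensorFlow. The sketch below uses a hypothetical factory, make_clip_fn, which is not part of TensorFlow or ivis and only mimics the shape of a function that returns a lambda. The stdlib pickle module stores functions by reference to their qualified name, and a name containing `<locals>` cannot be looked up again at load time.

```python
import pickle


def make_clip_fn(limit):
    # Hypothetical stand-in for a factory that returns a lambda;
    # it is not the TensorFlow function named in the traceback.
    return lambda value: max(-limit, min(limit, value))


clip = make_clip_fn(1.0)

try:
    pickle.dumps(clip)
    pickled_ok = True
except (AttributeError, pickle.PicklingError) as exc:
    # The exception type differs between Python versions, so both are caught.
    pickled_ok = False
    error_message = str(exc)

print("pickled:", pickled_ok)
```

The lambda itself works as expected when called; only serializing it with pickle fails.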

One alternative I've seen other packages use is cloudpickle. It adds a dependency, but it is a drop-in replacement if you're already using pickle.

Szubie commented 3 years ago

Hi, quick update on this issue: we've released a new version of ivis that uses the dill package to save the optimizer, enabling model saving and reloading with tensorflow==2.4.0. cloudpickle also works as a fix, but we settled on dill for now due to our familiarity with the package.
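As a rough sketch of why dill helps, and not the actual ivis implementation, the example below round-trips a locally defined lambda through dill.dump and dill.load. The make_clip_fn factory and the roundtrip helper are hypothetical names used only for illustration, and the import is guarded because dill is a third-party package that may not be installed.

```python
import io

try:
    import dill  # third-party package: pip install dill
except ImportError:
    dill = None


def make_clip_fn(limit):
    # Hypothetical factory returning a lambda, for illustration only.
    return lambda value: max(-limit, min(limit, value))


def roundtrip(obj):
    # Serialize to an in-memory buffer and load the object back.
    buffer = io.BytesIO()
    dill.dump(obj, buffer)
    buffer.seek(0)
    return dill.load(buffer)


if dill is not None:
    restored = roundtrip(make_clip_fn(1.0))
    print(restored(5.0))
```

dill serializes the function body and its closure by value instead of by name, which is why the lambda survives the round trip. The same property means files written this way should only be loaded from trusted sources, as with any pickle-based format.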

It's possible that model saving will be improved and made more robust with a more significant refactoring of the project in the future, but this should let people save their models in the meantime.