leojklarner / gauche

A Library for Gaussian Processes in Chemistry
https://leojklarner.github.io/gauche/
MIT License
213 stars, 22 forks

Difficulty saving and loading models using `NonTensorialInputs` data #68

Closed: kkovary closed 1 month ago

kkovary commented 6 months ago

First off, great work, this is a really cool package!

I've been experimenting with graph inputs built with graphein, feeding them to a model based on SIGP (some examples in your codebase call it GraphGP), and it has been performing really well. However, I'm struggling to work out how to correctly save the model after training and load it back into memory for inference. If I save the state dict and then re-initialize the model from it, the model performs as if it had been randomly initialized. I also tried pickling the model (not the ideal solution), but I get the following exception:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[88], line 4
      1 import pickle
      3 with open('model.pkl', 'wb') as file:
----> 4     pickle.dump(model, file)

RuntimeError: Pickling of "rdkit.Chem.rdchem.Atom" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)

I tried setting `train_inputs` to `None` before saving. This took care of the exception, but I'm back to the original issue: the loaded model behaves as if it had been randomly initialized.

I was wondering if you had any guidance here, or if there was something in the docs that I missed. Thanks!
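
One possible explanation for the "randomly initialized" behaviour, stated as an assumption rather than a confirmed diagnosis of this case: an exact GP's predictions depend on the training data it is conditioned on as well as on its hyperparameters, and in GPyTorch the state dict holds the hyperparameters but not `train_inputs` or `train_targets`. The pure-Python toy below (hypothetical helpers `rbf` and `posterior_mean`, not GAUCHE or GPyTorch code) shows how the same hyperparameters fall back to the prior mean when the training data is missing.

```python
import math


def rbf(a, b, lengthscale):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)


def posterior_mean(x_star, train_x, train_y, lengthscale, noise):
    """Posterior mean of a zero-mean exact GP at x_star (toy implementation)."""
    if not train_x:
        return 0.0  # nothing to condition on, so the prior mean is returned
    n = len(train_x)
    # Augmented matrix [K + noise * I | y] for the linear solve.
    aug = [
        [rbf(train_x[i], train_x[j], lengthscale) + (noise if i == j else 0.0)
         for j in range(n)] + [train_y[i]]
        for i in range(n)
    ]
    # Gauss-Jordan elimination; K + noise * I is positive definite,
    # so the diagonal pivots stay positive.
    for i in range(n):
        pivot = aug[i][i]
        aug[i] = [v / pivot for v in aug[i]]
        for r in range(n):
            if r != i:
                factor = aug[r][i]
                aug[r] = [vr - factor * vi for vr, vi in zip(aug[r], aug[i])]
    alpha = [row[n] for row in aug]
    return sum(rbf(x_star, xi, lengthscale) * ai for xi, ai in zip(train_x, alpha))


# The "saved" hyperparameters are identical in both calls.
hyperparameters = {"lengthscale": 1.0, "noise": 0.01}
with_data = posterior_mean(0.0, [0.0, 1.0], [1.0, 2.0], **hyperparameters)
without_data = posterior_mean(0.0, [], [], **hyperparameters)
```

Here `with_data` lands close to the training target at `x = 0.0`, while `without_data` is just the prior mean. If something similar is happening with SIGP, re-creating the model with the original training inputs and targets before loading the state dict would be the first thing to check.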

Ryan-Rhys commented 6 months ago

Hi Kyle,

Many thanks for your message! Saving the model's state_dict and reinitializing would be the standard solution in GPyTorch. @InfProbSciX put SIGP together, if I recall correctly. In the meantime, it would be great to get a full reproduction of exactly how you're reinitializing from the state_dict!

Ryan

kkovary commented 6 months ago

Hey, thanks for getting back to me. I put together a custom serializer and was able to get the state to save and load successfully. I'll post an update here soon when I get some free time. Again, great work on this!
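
The serializer itself is not shown in this thread. As a rough illustration of the general idea only, one approach is to store the hyperparameters together with a picklable encoding of the training inputs (for molecules, SMILES strings would be a natural choice) and to rebuild the unpicklable objects on load. Everything in the sketch below is a hypothetical stand-in using only the standard library: `FakeGraph` mimics an object that refuses to be pickled, and `ToyModel`, `save_model` and `load_model` are invented names, not GAUCHE, GPyTorch or kkovary's code.

```python
import io
import pickle


class FakeGraph:
    """Hypothetical stand-in for a graph object that cannot be pickled."""

    def __init__(self, smiles):
        self.smiles = smiles

    def __reduce__(self):
        # Mimic the failure mode of objects without pickle support.
        raise RuntimeError("Pickling of FakeGraph instances is not enabled")


class ToyModel:
    """Hypothetical model holding training data and hyperparameters."""

    def __init__(self, graphs, targets, hyperparameters=None):
        self.train_inputs = graphs
        self.train_targets = targets
        self.hyperparameters = hyperparameters or {"lengthscale": 1.0}


def save_model(model, file):
    """Write only picklable pieces: hyperparameters, SMILES and targets."""
    payload = {
        "hyperparameters": model.hyperparameters,
        "smiles": [graph.smiles for graph in model.train_inputs],
        "targets": list(model.train_targets),
    }
    pickle.dump(payload, file)


def load_model(file):
    """Rebuild the graphs from SMILES, then re-create the model around them."""
    payload = pickle.load(file)
    graphs = [FakeGraph(smiles) for smiles in payload["smiles"]]
    return ToyModel(graphs, payload["targets"], payload["hyperparameters"])


model = ToyModel(
    [FakeGraph("CCO"), FakeGraph("c1ccccc1")], [0.3, 1.2], {"lengthscale": 0.7}
)
buffer = io.BytesIO()
save_model(model, buffer)
buffer.seek(0)
restored = load_model(buffer)
```

The restored object carries both the training inputs and the hyperparameters, whereas calling `pickle.dumps(model)` directly fails on the graph objects. With a real model, the rebuild step would presumably regenerate the graph representations from the stored strings before the state dict is loaded.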

Ryan-Rhys commented 6 months ago

Great to hear, Kyle! It would be great to see your custom serializer, and feel free to open a PR!

Ryan-Rhys commented 3 months ago

Hey @kkovary, any word on your custom serializer? Would be great to have it as a contribution!

kkovary commented 3 months ago

Thanks for the reminder, I'll submit a PR this weekend!