choderalab / espaloma

Extensible Surrogate Potential of Ab initio Learned and Optimized by Message-passing Algorithm 🍹https://arxiv.org/abs/2010.01196
https://docs.espaloma.org/en/latest/
MIT License

Charges are not reproducible when loading a pretrained model from disk file #181

Closed FloLangenfeld closed 10 months ago

FloLangenfeld commented 10 months ago

I've been experimenting with Espaloma for a couple of days and switched from version 0.2.4 to 0.3.1 yesterday. I'm encountering unexpected behaviour when loading a model file from disk: the charges computed by Espaloma differ between two runs for the same molecule.

As a MWE, I use the README molecule:

import torch
import espaloma as esp
from openff.toolkit.topology import Molecule

molecule = Molecule.from_smiles("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
molecule_graph = esp.Graph(molecule)

espaloma_model = torch.load("espaloma-latest.pt")   # downloaded from the Espaloma github releases page
espaloma_model(molecule_graph.heterograph)

print(molecule_graph.nodes["n1"].data["q"])

If I run this script twice, I obtain different charges. This was not the case with v0.2.4.

Also, when I use espaloma_model = esp.get_model("latest"), the charges are the same.

Any idea why I get this behaviour, please?

mikemhenry commented 10 months ago

@FloLangenfeld Thank you for opening this issue!

What do you mean by:

Also, when I use espaloma_model = esp.get_model("latest"), the charges are the same.

Are you saying that the charges stay the same when using that line, but change when you download the file and load it manually?

If you could (just to help me and the team troubleshoot this) post an example where the charges change when you run the script twice, and an example where they stay the same, that would be great!

Could you also paste the output of conda list?

Thanks! Mike

ijpulidos commented 10 months ago

I can reproduce this. The following code shows that even though the weights and biases seem to be the same for both the locally loaded and the remotely loaded models, the charges are different:

import torch
import espaloma as esp
from openff.toolkit.topology import Molecule

molecule = Molecule.from_smiles("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
molecule_graph_local = esp.Graph(molecule)
molecule_graph_latest = esp.Graph(molecule)

for _ in range(10):
    espaloma_local_model = torch.load("/home/user/espaloma/espaloma-0.3.1.pt")
    espaloma_latest_model = esp.get_model(version="latest")
    espaloma_local_model(molecule_graph_local.heterograph)
    espaloma_latest_model(molecule_graph_latest.heterograph)
    for key, value in espaloma_latest_model.state_dict().items():
        values_equality = torch.equal(value, espaloma_local_model.state_dict()[key])
        if not values_equality:
            print(f"Key: {key}, equal: {values_equality}")
    charges_local = molecule_graph_local.nodes["n1"].data["q"]
    charges_latest = molecule_graph_latest.nodes["n1"].data["q"]
    print(torch.equal(charges_local, charges_latest))

This prints a bunch of False for me, which means the charges differ between the locally loaded model and the remote one.

I can also confirm that the remote model always predicts the same charges, whereas the locally stored one doesn't. I wonder if this is something with torch.load that we are missing, since that is the only difference I see (even though the weights and biases are the same!). I'm not a torch expert, so maybe others can chime in here.
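One detail worth noting about the comparison above: a module's state_dict() contains only parameters and buffers, so the train/eval mode can never show up in that loop even when it differs between the two models. A minimal sketch with plain PyTorch (a toy module, not the espaloma model itself):

```python
import torch

# A toy module with one parameterized layer and one stochastic layer.
net = torch.nn.Sequential(torch.nn.Linear(2, 2), torch.nn.Dropout(p=0.5))

# Only the Linear layer's parameters appear; Dropout contributes nothing,
# and there is no entry recording train vs eval mode.
print(sorted(net.state_dict().keys()))  # ['0.bias', '0.weight']

net.eval()
print(sorted(net.state_dict().keys()))  # unchanged: ['0.bias', '0.weight']

# The mode lives on the module object itself, not in the state dict.
print(net.training)  # False
```

So two models with identical state dicts can still behave differently if one is in training mode and the other is not.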

mikemhenry commented 10 months ago

Looking at https://github.com/choderalab/espaloma/blob/main/espaloma/utils/model_fetch.py I believe you need to call model.eval() before using it.
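For context on why this matters: torch.load restores the module in whatever mode it was saved in (modules default to training mode), and layers like Dropout behave stochastically in training mode, so repeated forward passes give different outputs. A minimal sketch with plain PyTorch (a toy module, not the espaloma model itself):

```python
import torch

torch.manual_seed(0)
# A freshly constructed (or torch.load-ed) module starts in training mode,
# where Dropout randomly zeroes activations on every forward pass.
model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.Dropout(p=0.5))
x = torch.ones(1, 64)

out_train_1 = model(x)
out_train_2 = model(x)
print(torch.equal(out_train_1, out_train_2))  # False -- different dropout masks

model.eval()  # switch Dropout (and layers like BatchNorm) to deterministic inference mode
out_eval_1 = model(x)
out_eval_2 = model(x)
print(torch.equal(out_eval_1, out_eval_2))  # True
```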

ijpulidos commented 10 months ago

Yes, that does the trick, good catch!

If I add espaloma_local_model.eval() right after loading the model, it gives the expected results.

mikemhenry commented 10 months ago

Cool!

So @FloLangenfeld try running espaloma_model.eval() after you load the model and see if that fixes things, like this:

import torch
import espaloma as esp
from openff.toolkit.topology import Molecule

molecule = Molecule.from_smiles("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
molecule_graph = esp.Graph(molecule)

espaloma_model = torch.load("espaloma-latest.pt")   # downloaded from the Espaloma github releases page
espaloma_model.eval()  # Need to call this when loading a model with torch.load
espaloma_model(molecule_graph.heterograph)

print(molecule_graph.nodes["n1"].data["q"])

FloLangenfeld commented 10 months ago

Are you saying that the charges stay the same when using that line, but change when you download the file and load it manually?

Yes, that's my point.

Cool!

So @FloLangenfeld try running espaloma_model.eval() after you load the model and see if that fixes things [...]

Works perfectly, I now get the same results with both methods (loading the model remotely or locally)! Thanks @mikemhenry and @ijpulidos for the quick replies!

For the record, here is the conda environment I'm using (conda_env.txt), and here are the charges (charges.txt): computed twice with the latest pretrained model loaded locally (without the espaloma_model.eval() step), then the same with the model loaded remotely (esp.get_model(version="latest")), and finally the charges when calling espaloma_model.eval() after loading a local copy of the latest pretrained model.

I think it might be worth mentioning this point somewhere in the docs. What do you think?

ijpulidos commented 10 months ago

@FloLangenfeld Yes, I agree, this is probably something we should improve in the documentation. I've already raised the issue and contributed the changes to the docs. Thanks!

FloLangenfeld commented 10 months ago

Perfect @ijpulidos! Closing this issue, then.

Thank you again!