DeepGraphLearning / NBFNet

Official implementation of Neural Bellman-Ford Networks (NeurIPS 2021)
MIT License

Error when loading pretrained epoch #9

Closed jwzhi closed 2 years ago

jwzhi commented 2 years ago

Hi,

I tried a new model on NBFNet and then tried to load it, but loading fails. The issue seems to come from torchdrug/patch.py. Do you have a good solution for this?

Traceback (most recent call last):
  File "script/run.py", line 60, in <module>
    solver = util.build_solver(cfg, dataset)
  File "/shared-datadrive/shared-training/NBFNet/nbfnet/util.py", line 120, in build_solver
    solver.load(cfg.checkpoint)
  File "/home/azureuser/.pyenv/versions/nbfnet/lib/python3.8/site-packages/torchdrug-0.1.2-py3.8.egg/torchdrug/core/engine.py", line 231, in load
    self.model.load_state_dict(state["model"])
  File "/home/azureuser/.pyenv/versions/nbfnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for KnowledgeGraphCompletion:
  While copying the parameter named "graph", expected torch.Tensor or Tensor-like object from checkpoint but received <class 'torchdrug.data.graph.Graph'>
  While copying the parameter named "fact_graph", expected torch.Tensor or Tensor-like object from checkpoint but received <class 'torchdrug.data.graph.Graph'>

I also checked in pdb, stopped at the load_state_dict call, and nn.Module has in fact been overwritten by PatchedModule:

-> self.model.load_state_dict(state["model"])
(Pdb) nn.Module
<class 'torchdrug.patch.PatchedModule'>

JiaangL commented 2 years ago

@jwzhi Hi, I met the same problem. Have you solved this one?

jwzhi commented 2 years ago

This is mainly due to the updated version of PyTorch's nn.Module. For a temporary fix for NBFNet, see this post: https://github.com/DeepGraphLearning/torchdrug/issues/89
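
For anyone who wants to see which checkpoint entries trigger the error, here is a minimal sketch. It is not necessarily the fix described in the linked issue. It assumes the checkpoint is a dict with a "model" entry, as in the traceback, and that the "graph" and "fact_graph" entries can be skipped because the model rebuilds them from the dataset. The helper name split_state_dict is hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple


def split_state_dict(
    state: Dict[str, Any], is_tensor: Callable[[Any], bool]
) -> Tuple[Dict[str, Any], List[str]]:
    """Return (entries that pass is_tensor, names of the entries that do not)."""
    kept = {name: value for name, value in state.items() if is_tensor(value)}
    skipped = [name for name in state if name not in kept]
    return kept, skipped


# Intended use with PyTorch (not executed here; `path` and `solver` are
# assumed to come from the surrounding script):
#
#     import torch
#     state = torch.load(path, map_location="cpu")
#     kept, skipped = split_state_dict(state["model"], torch.is_tensor)
#     print("skipping non-tensor entries:", skipped)
#     result = solver.model.load_state_dict(kept, strict=False)
#     print("missing keys:", result.missing_keys)

# Stand-in demo without PyTorch: floats play the role of tensors and a plain
# object plays the role of a torchdrug Graph.
class FakeGraph:
    pass


demo = {"model.weight": 1.0, "graph": FakeGraph(), "fact_graph": FakeGraph()}
demo_kept, demo_skipped = split_state_dict(demo, lambda v: isinstance(v, float))
print(sorted(demo_kept), demo_skipped)
```

Note that strict=False also hides keys that are missing for other reasons, so it is worth checking the returned missing_keys before trusting the loaded model.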