Issues to proxy model

XUHAN314 commented 1 year ago

I found that the proxy model cannot predict rewards for graphs. Here is the output I got:

no arrays
Namespace(R_min=0.1, array='', balanced_loss=True, bootstrap_tau=0, clip_grad=0, clip_loss=0, early_stop_reg=0.1, floatX='float64', ignore_parents=False, include_nblocks=False, initial_log_Z=30, leaf_coef=10, learning_rate=0.0005, log_reg_c=2.5e-05, max_blocks=8, mbsize=4, min_blocks=2, model_version='v4', nemb=256, num_conv_steps=10, num_iterations=250000, objective='tb', opt_beta=0.9, opt_beta2=0.999, opt_epsilon=1e-08, print_array_length=False, progress='yes', proxy_path='./data/pretrained_proxy', random_action_prob=0.05, replay_mode='online', repr_type='block_graph', reward_exp=10, reward_norm=8, run=0, sample_prob=1, save_path='results/', shift=0, weight_decay=0)
v4 v: 2
exception mat2 must be a matrix, got 1-D tensor
joining
python-BaseException
Traceback (most recent call last):
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 917, in <module>
    raise e
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 909, in <module>
    main(args)
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 697, in main
    train_model_with_proxy(args, model, proxy, dataset, do_save=True)
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 540, in train_model_with_proxy
    minibatch = dataset.sample2batch(dataset.sample(mbsize))
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 362, in sample
    trajectories = [self._get_sample_model() for i in range(n)]
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 362, in <listcomp>
    trajectories = [self._get_sample_model() for i in range(n)]
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 229, in _get_sample_model
    r = self._get_reward(m)
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 272, in _get_reward
    return self.r2r(normscore=self.proxy_reward(m))
  File "/home/xu/GFN_vs_HVI/mols/gflownet.py", line 447, in __call__
    return self.proxy(m, do_stems=False)[1].item()
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xu/GFN_vs_HVI/mols/model_atom.py", line 201, in forward
    return self.mpnn(graph, vec, do_stems=do_stems, do_bonds=do_bonds, k=k, do_dropout=do_dropout)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xu/GFN_vs_HVI/mols/model_atom.py", line 103, in forward
    m = self.act(self.conv(out, data.edge_index, data.edge_attr))
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch_geometric/nn/conv/nn_conv.py", line 101, in forward
    out = self.propagate(edge_index, x=x, edge_attr=edge_attr, size=size)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 317, in propagate
    out = self.message(**msg_kwargs)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch_geometric/nn/conv/nn_conv.py", line 113, in message
    weight = self.nn(edge_attr)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xu/anaconda3/envs/gflow/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat2 must be a matrix, got 1-D tensor

Process finished with exit code 1

malkin1729 commented 1 year ago

Can you please share your versions of torch and torch_geometric, the command you are running, and whether you made any changes to the code?
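
For anyone gathering the same information, here is a minimal sketch for reporting the installed versions. It uses only the standard library; the helper name `installed_versions` is made up for this example, and the distribution names `torch` and `torch-geometric` are assumed to be the ones used at install time.

```python
from importlib import metadata


def installed_versions(packages=("torch", "torch-geometric")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it inside the same conda environment as gflownet.py should print one line per package, which can be pasted into the issue.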

XUHAN314 commented 1 year ago

Thanks for your reply! I ran gflownet.py directly, without any changes to the code. I am using torch 2.0.0 and pyg 2.3.0. I also tried an older version (torch 1.11), but it doesn't work either. I cannot install from requirements.txt because the pinned versions are too old and pip/conda no longer seem to support them.

XUHAN314 commented 1 year ago

I found where the problem comes from. When the parameters are loaded from the pre-trained model, the linear layer gets the wrong parameters (because the order of the parameters is not consistent). Do you have any suggestions for loading the model correctly? Thanks!
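
One way to confirm this kind of ordering problem is to compare shapes position by position before copying anything into the model. The sketch below is only an illustration: it assumes the checkpoint is an ordered list of arrays that gets matched to the model's parameters by position, and the function name and the example shapes are hypothetical, not taken from the repository.

```python
from itertools import zip_longest


def find_shape_mismatches(model_shapes, saved_shapes):
    """Compare model parameters with saved arrays, position by position.

    model_shapes: ordered list of (name, shape) pairs, e.g. built in PyTorch as
        [(n, tuple(p.shape)) for n, p in model.named_parameters()]
    saved_shapes: ordered list of shapes of the arrays in the checkpoint.
    Returns a list of (index, name, expected_shape, saved_shape) tuples;
    a missing entry on either side shows up as None.
    """
    mismatches = []
    pairs = zip_longest(model_shapes, saved_shapes)
    for index, (model_entry, saved) in enumerate(pairs):
        name, expected = model_entry if model_entry is not None else (None, None)
        expected = tuple(expected) if expected is not None else None
        saved = tuple(saved) if saved is not None else None
        if expected != saved:
            mismatches.append((index, name, expected, saved))
    return mismatches


# Hypothetical shapes: the first two saved arrays are swapped relative to the model.
model_shapes = [("lin.weight", (16, 4)), ("lin.bias", (16,)), ("out.weight", (1, 16))]
saved_shapes = [(16,), (16, 4), (1, 16)]

for index, name, expected, saved in find_shape_mismatches(model_shapes, saved_shapes):
    print(f"position {index}: {name} expects {expected}, checkpoint has {saved}")
```

A 1-D array landing where a 2-D weight is expected would be consistent with the "mat2 must be a matrix, got 1-D tensor" error above, since the linear layer would then receive a vector as its weight.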

malkin1729 commented 1 year ago

I checked with the authors of the NeurIPS 2021 GFlowNet paper's code, which the code here is based upon. They say the issue is with the version of PyTorch Geometric (newer versions will not work). Installing the right version (1.6.3) is tricky, but you can get wheels directly from the PyTorch Geometric project website.

This is obviously not an ideal solution for long-term maintainability. There may be workarounds for newer versions that manually convert the parameters into the right shapes. Depending on what you want to use this code for, you may also be interested in the GFlowNet-for-molecules library by Recursion, which is compatible with newer libraries.

GFNOrg / GFN_vs_HVI

Issues to proxy model #1