slryou41 / reaction-gcnn

Chainer implementation of Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions

Problem with GCNNs other than NFP #2

Closed Nanotekton closed 3 years ago

Nanotekton commented 3 years ago

Hi there, I was trying to train your model on a set of Suzuki couplings at my disposal, yet I've got the following error:

Traceback (most recent call last):
  File "reaction-gcnn/train.py", line 420, in <module>
    main()
  File "reaction-gcnn/train.py", line 373, in main
    args.conv_layers, class_num)
  File "reaction-gcnn/train.py", line 224, in set_up_predictor
    n_layers=conv_layers)
TypeError: __init__() got an unexpected keyword argument 'hidden_dim'

The same error appears when I use train_attention_model.py instead of train.py. I suppose this may be related to the versions of chainer/chainer_chemistry (I installed these just with pip). Could you specify which versions you used in your project?

Here, I have chainer_chemistry=0.7.1 and chainer=7.8.0.

As a side note, a script transforming a Reaxys CSV into training data could be useful (I found only the dictionary preparation in the repo). I think I've reverse-engineered this; hope the result is correct.

michaelmaser commented 3 years ago

Hello,

Yes, it could possibly be version-related, though I haven't had a chance to dig into it much. I have chainer_chemistry=0.7.0 and chainer=7.2.0. I don't know if I have updated since publishing this. Hope this helps for now. Maybe @slryou41 has it logged?
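
If it helps, here is a quick way to check which versions you have installed (just a minimal sketch; the version numbers in the comments are the ones mentioned in this thread, not a tested compatibility matrix):

    # Print the installed versions; this thread suggests chainer==7.2.0 and
    # chainer_chemistry==0.7.0 work, while chainer==7.8.0 / chainer_chemistry==0.7.1
    # raised the 'hidden_dim' TypeError above.
    import chainer
    import chainer_chemistry

    print("chainer:", chainer.__version__)
    print("chainer_chemistry:", chainer_chemistry.__version__)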

As a side note, a script transforming reaxys CSV into training data could be useful (I found only dictionary preparation in the repo).

I agree, thanks! Unfortunately, due to the nature of the data, the "script" was actually a lot of manual filtering for missing values, incorrect/unreasonable values, misspellings, too many or too few reactants/products, etc. This could potentially be scripted to some extent, but the rules for catching the bad labels and supplying the correct ones would be tricky. I can try to put something together in the future if time allows. The dictionary preparation is the most important piece; otherwise your implementation should work fine. I believe this preprint has some documentation on a similar process: https://chemrxiv.org/engage/chemrxiv/article-details/60c74d62567dfeb643ec5333. Hope this helps.

Nanotekton commented 3 years ago

Thanks. Indeed, I managed to run the rel-gcn model after adjusting the versions. However, with rel-gat I have the following issue when using train_attention_model.py:

  File "/home/wbeker/miniconda3/envs/reaction_gcn_env/lib/python3.7/site-packages/chainer/optimizer.py", line 871, in update
    loss = lossfun(*args, **kwds)
  File "/home/wbeker/projects/suzuki/reaction_gcn_baseline/reaction-gcnn/models/classifier.py", line 174, in __call__
    self.y = self.predictor(*args, **kwargs)
  File "reaction-gcnn/train_attention_model.py", line 72, in __call__
    h = self.aggr_attention([h1, h2, h3])
  File "/home/wbeker/projects/suzuki/reaction_gcn_baseline/reaction-gcnn/models/attention_readout.py", line 52, in __call__
    g1 = functions.sigmoid(self.i_layer(h1))
  File "/home/wbeker/miniconda3/envs/reaction_gcn_env/lib/python3.7/site-packages/chainer_chemistry/links/connection/graph_linear.py", line 36, in __call__
    s0, s1, s2 = h.shape
ValueError: not enough values to unpack (expected 3, got 2)

I confirmed that aggr_attention receives three tensors, each with shape (32, 128). GraphLinear, however, expects a 3D tensor (batch, n_atoms, n_channels), which it reshapes to (batch*n_atoms, n_channels), passes to the parent's __call__ method, and then reshapes back to the original form. In theory, I could just add an 'if' block depending on the rank of the tensors (see the sketch below), but I'm not sure whether that is correct - I'm not familiar with this implementation.
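
For what it's worth, a minimal sketch of the 'if' block I have in mind (untested, based only on the shapes reported above; apply_graph_linear is just a hypothetical helper wrapping the existing GraphLinear layer):

    from chainer import functions

    def apply_graph_linear(layer, h):
        """Apply a GraphLinear layer to a rank-2 or rank-3 input.

        GraphLinear unpacks h.shape into (batch, n_atoms, n_channels), so a
        rank-2 (batch, n_channels) input fails. Add a singleton atom axis
        before the call and squeeze it back out afterwards.
        """
        if h.ndim == 2:                        # (batch, n_channels)
            h = functions.expand_dims(h, 1)    # -> (batch, 1, n_channels)
            out = layer(h)                     # rank-3 input, as GraphLinear expects
            return functions.squeeze(out, 1)   # -> (batch, out_channels)
        return layer(h)                        # already (batch, n_atoms, n_channels)

I could then route calls like self.i_layer(h1) in attention_readout.py through this wrapper, but again, I don't know whether that matches the intended semantics of the readout.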

With train.py I got some memory allocation errors, present even after I threw out reactions with products larger than 50 atoms and restricted the training set to the first 2000 reactions. Since our RTX 2080 has about 11 GB of memory, this looks strange. I guess this indicates that I should use train_attention_model.py.

Could you confirm which script I should use for rel-GAT?

Unfortunately, due to the nature of the data, the "script" was actually a lot of manual filtering for missing values, incorrect/unreasonable values, misspellings, too many or too few reactants/products, etc.

Yes, I know the issue. However, I think that a sample (even with 'fake' data) could be useful - after all, the point of my 'reverse-engineering' was to deduce the format of the resulting *.csv.

slryou41 commented 3 years ago

Hi, thanks for your interest in our work! train_attention_model.py was originally designed only for the rel-gcn model, which was the best-performing graph neural network of the ones we tried at that time (so the attention layer has the same operation scheme as the rel-gcn model). The output format of the rel-gat model is different from the rel-gcn model, so we would have to dig into it if we wanted to apply it to rel-gat. You should use the train.py script for the rel-gat model.

With train.py I've got some memory allocation errors, present even if I threw out reactions with products larger than 50 atoms and restricted the training set to first 2000 reactions.

Yes, the rel-gat model is computationally heavier than the other models, so you have to reduce the batch size to fit your GPU memory. You can specify it by passing --batchsize [N] in the training command (python train.py -m relgat --batchsize 12 ...).

Yes, I know the issue. However, I think that a sample (even with 'fake' data) could be useful - after all, the point of my 'reverse-engineering' was to deduce the format of the resulting *.csv.

Thanks for your suggestion! We will add fake CSV files in the near future.

Nanotekton commented 3 years ago

Thanks a lot, it seems to be working now. Closing.