charlesxu90 opened this issue 2 years ago
BTW, why do you use an LSTM for the scaffold embedding, rather than a linear embedding with a positional embedding?
To me, using an LSTM here feels strange. If you can use an LSTM for the scaffold, why not use it for all the inputs?
I'm not sure we can pinpoint the problem straight away like this; have you tried using the pretrained weights for generation? The LSTM was used to represent the scaffold as a single vector, more for convenience than anything. To take advantage of the transformer architecture, we need linear and positional embeddings for the main molecule.
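For readers unfamiliar with the setup, here is a minimal sketch of the embedding scheme described above: the scaffold SMILES is compressed into a single conditioning vector by an LSTM, while the molecule tokens get standard token plus positional embeddings before entering the transformer. The names and sizes (`d_model`, `vocab_size`, `max_len`) are illustrative assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class ScaffoldConditionedEmbedding(nn.Module):
    """Sketch: LSTM-encoded scaffold vector + token/positional embeddings for the molecule."""

    def __init__(self, vocab_size=100, d_model=256, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)                 # molecule token embedding
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len, d_model))    # learned positional embedding
        self.scaffold_emb = nn.Embedding(vocab_size, d_model)            # scaffold token embedding
        self.scaffold_lstm = nn.LSTM(d_model, d_model, batch_first=True)

    def forward(self, mol_tokens, scaffold_tokens):
        # Scaffold: run the LSTM and keep only the final hidden state,
        # i.e. one conditioning vector per scaffold.
        _, (h_n, _) = self.scaffold_lstm(self.scaffold_emb(scaffold_tokens))
        scaffold_vec = h_n[-1].unsqueeze(1)                              # (B, 1, d_model)

        # Molecule: token + positional embeddings, as in a standard GPT block.
        x = self.tok_emb(mol_tokens) + self.pos_emb[:, :mol_tokens.size(1), :]

        # Prepend the scaffold vector so every molecule position can attend to it.
        return torch.cat([scaffold_vec, x], dim=1)
```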
Dear authors,
I tried to train the model with the default parameters on the moses3.csv dataset and then generate with the trained model. However, the validity I achieved is 0.868, which is much lower than the 0.994 reported in the paper.
Do you have any suggestions for solving this problem, or any hints on how to improve the validity? Thanks!
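For context, validity is usually computed as the fraction of generated SMILES that RDKit can parse. A generic sketch of that check (not this repository's evaluation script; `generated.txt` is a placeholder file name) looks like:

```python
from rdkit import Chem, RDLogger

RDLogger.DisableLog("rdApp.*")  # silence parse warnings for invalid SMILES

def validity(smiles_list):
    """Fraction of SMILES strings that RDKit parses into a valid molecule."""
    valid = sum(1 for s in smiles_list if Chem.MolFromSmiles(s) is not None)
    return valid / len(smiles_list)

with open("generated.txt") as f:
    samples = [line.strip() for line in f if line.strip()]

print(f"Validity: {validity(samples):.3f}")  # e.g. 0.868 in the report above
```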