Closed alexander-turner closed 6 years ago
Hmm... have you seen issue #6? Maybe some of the things discussed there are what you're encountering?
One thing: if you try to retrain a pretrained model, the large default learning rate will likely destroy the current parameters, so you'd want a much lower initial learning rate.
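As a sketch of that fine-tuning setup (assuming Keras 2.x; the weights path and loss here are placeholders, not this repo's actual file names or loss function):

```python
# Sketch only: resume training from pretrained weights with a reduced
# learning rate so the first updates don't wreck the pretrained parameters.
from keras.models import load_model
from keras.optimizers import Adam

model = load_model('pretrained_zinc.h5')  # hypothetical weights path
# Recompile with ~1/100th of Adam's 1e-3 default before calling fit();
# the loss must be whatever the model was originally compiled with.
model.compile(optimizer=Adam(lr=1e-5), loss='binary_crossentropy')
```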
I tried lowering Adam's learning rate to 1/100th of its default; the pretrained model starts at 2% accuracy and trends downward quickly, reaching 0.73% accuracy after 17 epochs. I could check the reconstruction accuracy (is there a script for this that I'm not seeing?), but the loss is also pretty terrible right now, so I doubt it's secretly performing well.
I've also seen the accuracy reported during training be really low, but when I evaluate the reconstruction accuracy I get the result from the paper. Perhaps @MustafaMustafa could share his code from #9?
Hi @mkusner and @alexander-turner, I can reproduce the paper results with any of the following settings:
1) adam_lr = 5e-4, batchsize = 1000, for 100 epochs (two GPUs)
2) adam_lr = 2e-4 decayed to 1e-4 after roughly 34 epochs, batchsize = 50, for 100 epochs
3) latent vector size changed to 64, GRU units to 512, and encoder dense layers to 512 units, with adam_lr = 5e-4, batchsize = 500, for 100 epochs
The best results came from the last two settings, though the second is very slow due to the small batch size. They all give accuracy in the 50%+ range (accuracy as defined in the paper appendix, not Keras's bit-by-bit accuracy).
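If it helps, the decay in setting (2) can be written as a simple step schedule and plugged into Keras's `LearningRateScheduler` callback; a minimal sketch (the epoch-34 cutoff just follows the "roughly 34 epochs" above):

```python
def step_lr(epoch, initial_lr=2e-4, decayed_lr=1e-4, decay_epoch=34):
    """Step schedule for setting (2): 2e-4 until ~epoch 34, then 1e-4."""
    return initial_lr if epoch < decay_epoch else decayed_lr

# Usage with Keras:
#   from keras.callbacks import LearningRateScheduler
#   model.fit(..., callbacks=[LearningRateScheduler(step_lr)])
```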
@MustafaMustafa Thanks! Do you happen to have the code available for reconstruction accuracy?
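In the meantime, here's my understanding of the appendix metric as a sketch: encode each test molecule several times, decode each latent point several times, and count exact string matches. The `encode`/`decode` callables stand in for the trained VAE's (stochastic) methods, and the sample counts are illustrative, not the paper's exact numbers:

```python
def reconstruction_accuracy(inputs, encode, decode, n_encode=10, n_decode=10):
    """Fraction of decoded samples that exactly reproduce the input string.

    `encode` maps a string to a latent point and `decode` maps a latent
    point back to a string; both may be stochastic and are placeholders
    for the trained model's methods.
    """
    total, matched = 0, 0
    for s in inputs:
        for _ in range(n_encode):
            z = encode(s)
            for _ in range(n_decode):
                total += 1
                if decode(z) == s:
                    matched += 1
    return matched / total
```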
I assume this has been figured out. If not I will open this again!
I'm getting very low training performance on `train_zinc.py`, even when using the pretrained model and after having regenerated the data multiple times. The accuracy starts at about 0.15% and trends down to 0.05%, while the loss goes from 2.1 to 1.7. Strangely, the BO metrics seem almost unaffected.