Updated pretrained model?

liamnaka commented 7 years ago

First of all, I want to thank you all for your immense contributions to this library; it is truly helpful.

As someone without access to a powerful GPU, I find training these models very time-consuming. I encountered an error when testing the pretrained model_500k.h5 on smiles_50k.h5 with sample_gen.py.

I ran python sample_gen.py smiles_50k.h5 data/model_500k.h5 --target autoencoder and received the error ValueError: Shapes (9, 1, 56, 9) and (9, 1, 55, 9) are not compatible, raised from within the load_weights_from_hdf5_group function.

Perhaps an updated pretrained model is needed? The models I train compile, but are very inaccurate (because of my machine's limitations), so something tells me it has to do with the provided model. I might be able to get an AWS server running to help out if needed.

Regards, Liam

pechersky commented 7 years ago

I think that error comes from the fact that sample_gen.py assumes a fixed charset that is not equivalent to the one used by the pretrained model. I'll take a look at updating the pretrained model. In the meantime, you can try the older sample.py on the pretrained model.
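
To illustrate the mismatch, here is a minimal sketch of how a charset derived from the data can end up a different size than the one a model was trained with. The helper names build_charset and check_compatible, and the pad_to parameter, are hypothetical and are not functions from this repository; the sketch assumes the charset is simply the set of distinct characters in the (optionally space-padded) SMILES strings.

```python
def build_charset(smiles_list, pad_to=None):
    """Collect the distinct characters in a list of SMILES strings.

    If pad_to is given, each string is right-padded with spaces first,
    so the space character becomes part of the charset (an assumption
    about how the data was preprocessed).
    """
    chars = set()
    for smiles in smiles_list:
        if pad_to is not None:
            smiles = smiles.ljust(pad_to)
        chars.update(smiles)
    return sorted(chars)


def check_compatible(charset, expected_size):
    """Raise ValueError if the charset size differs from what the model expects."""
    if len(charset) != expected_size:
        raise ValueError(
            "charset has %d characters, model expects %d"
            % (len(charset), expected_size)
        )


if __name__ == "__main__":
    charset = build_charset(["CCO", "c1ccccc1"], pad_to=10)
    print(charset, len(charset))
    check_compatible(charset, 5)  # passes; any other size would raise
```

A single extra character in the dataset is enough to change the one-hot width, which would be consistent with the 56 versus 55 in the reported shapes.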

Kfir-Schreiber commented 6 years ago

Hi,

I would like to echo @liamnaka's thanks. This project is truly helpful.

I get the same error when trying to use sample.py with the pretrained 500k model and the ChEMBL dataset. It seems like the saved weights are from a model that was trained with a charset of size 55 and the charset for ChEMBL has 56 characters.
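
As a rough sketch of a workaround, assuming the model's original charset were available: one could find which characters the dataset adds, and keep only the molecules that the model's charset can encode. The function names extra_characters and filter_encodable are hypothetical, and the charsets in the example are made up for illustration, not the real ones.

```python
def extra_characters(dataset_charset, model_charset):
    """Return the characters present in the dataset but unknown to the model."""
    return sorted(set(dataset_charset) - set(model_charset))


def filter_encodable(smiles_list, model_charset):
    """Keep only SMILES strings made entirely of characters the model knows."""
    allowed = set(model_charset)
    return [s for s in smiles_list if set(s) <= allowed]


if __name__ == "__main__":
    model_charset = ["C", "O"]            # made-up stand-in for the 55-char set
    dataset_charset = ["C", "O", "="]     # made-up stand-in for the 56-char set
    print(extra_characters(dataset_charset, model_charset))
    print(filter_encodable(["CCO", "C=O"], model_charset))
```

Without the original charset (and its ordering), this cannot be applied to the pretrained weights, which is why I am asking for it.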

Would it be possible to upload the original charset that was used to train the model?

Thanks, Kfir

maxhodak / keras-molecules

Updated pretrained model? #45