maxhodak / keras-molecules

Autoencoder network for learning a continuous representation of molecular structures.
MIT License

Issue replicating graph #71

Open hstone1 opened 7 years ago

hstone1 commented 7 years ago

I would like to generate a picture akin to the one displayed in the README. However, even though I train my model to convergence beyond the point reported there, I do not get the distinct striations shown. Instead I get a more spread-out plot with only faint striations:

(attached image: spread-out 2D latent-space scatter)

Has anyone been able to replicate the image as displayed in the paper and README? Was it generated using an actual 2D latent dimension, or a higher-dimensional latent space then reduced to 2D with PCA? I have tried both and neither has worked; any help would be greatly appreciated.

osmelu commented 6 years ago

Same problem here. I've trained the model on the 500k ChEMBL data set using a 292-dimensional latent space, and after 30 epochs I got loss: 0.4956 and acc: 0.955. However, I'm far from that performance when using a 2D latent space (loss: 2.8712, acc: 0.7075 after 30 epochs). This is how the data looks in the 2D latent space:

```python
from pylab import figure, scatter, show

x_latent = model.encoder.predict(data_train)
figure(figsize=(6, 6))
scatter(x_latent[:, 0], x_latent[:, 1], marker='.')
show()
```
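For anyone wanting to run the plotting step in isolation: here is a self-contained equivalent using `matplotlib.pyplot` directly instead of the `pylab` interface. Since `model` and `data_train` come from the repo's training setup, random 2D codes stand in for `model.encoder.predict(data_train)` in this sketch.

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

# Stand-in for model.encoder.predict(data_train): random 2D latent codes.
x_latent = np.random.RandomState(0).randn(5000, 2)

plt.figure(figsize=(6, 6))
plt.scatter(x_latent[:, 0], x_latent[:, 1], marker='.', s=1)
plt.xlabel('latent dim 1')
plt.ylabel('latent dim 2')
plt.savefig('latent_2d.png', dpi=150)
```

With the real encoder, replace the random array with `model.encoder.predict(data_train)`; the rest is unchanged.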

(attached image: 2D latent-space scatter)

And this is how it looks using the first two principal components of the 292-dimensional latent space:

```python
from pylab import figure, scatter, show
from sklearn.decomposition import PCA

x_latent = model.encoder.predict(data_train)
pca = PCA(n_components=2)
x_latent_pca = pca.fit_transform(x_latent)
figure(figsize=(6, 6))
scatter(x_latent_pca[:, 0], x_latent_pca[:, 1], marker='.', s=1)
show()
```
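One diagnostic that may explain why a PCA projection looks spread out: check how much of the total variance the first two components actually capture, via scikit-learn's `explained_variance_ratio_` attribute. If that fraction is small, most of the structure lives in the remaining dimensions and the 2D scatter will be smeared. A minimal sketch (again with random 292-D codes standing in for `model.encoder.predict(data_train)`):

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for model.encoder.predict(data_train): random 292-D latent codes.
rng = np.random.RandomState(0)
x_latent = rng.randn(1000, 292)

pca = PCA(n_components=2)
x_latent_pca = pca.fit_transform(x_latent)

# Fraction of total variance captured by the first two components;
# for a well-structured latent space this should be substantially
# larger than the 2/292 you'd expect from isotropic noise.
print(pca.explained_variance_ratio_.sum())
```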

(attached image: PCA projection of the 292-D latent space)

@hstone1 Could you please explain how you obtained the image shown above?

Any ideas on how to reproduce the figure displayed in the README and in the paper?