Open hstone1 opened 7 years ago
Same problem here. I trained the model on the 500k ChEMBL data set using a 292-dimensional latent space, and after 30 epochs I got loss: 0.4956 and acc: 0.955. However, I'm far from that performance when using a 2D latent space (loss: 2.8712, acc: 0.7075 after 30 epochs). This is how the data looks in the 2D latent space:
from pylab import figure, scatter, show
# Encode the training set; with a 2D latent space each row is an (x, y) point
x_latent = model.encoder.predict(data_train)
figure(figsize=(6, 6))
scatter(x_latent[:, 0], x_latent[:, 1], marker='.')
show()
And this is how it looks using the first two principal components of the 292-dimensional latent space:
from sklearn.decomposition import PCA
from pylab import figure, scatter, show
# Encode the training set into the 292-dimensional latent space
x_latent = model.encoder.predict(data_train)
# Project the encodings onto their first two principal components
pca = PCA(n_components=2)
x_latent_pca = pca.fit_transform(x_latent)
figure(figsize=(6, 6))
scatter(x_latent_pca[:, 0], x_latent_pca[:, 1], marker='.', s=1)
show()
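When comparing a PCA projection against a true 2D latent space, it may also help to check how much of the total variance the first two components actually capture. The following is only a minimal sketch using NumPy: `explained_variance_ratio` is a hypothetical helper (it should match what scikit-learn's `PCA` exposes as `explained_variance_ratio_`), and `x_latent_demo` is random stand-in data, not real encoder output.

```python
import numpy as np


def explained_variance_ratio(x, n_components=2):
    """Fraction of total variance captured by each leading principal component."""
    # Center the data so the SVD corresponds to PCA
    x_centered = x - x.mean(axis=0)
    # Squared singular values are proportional to the variance along each principal axis
    singular_values = np.linalg.svd(x_centered, compute_uv=False)
    variances = singular_values ** 2
    return variances[:n_components] / variances.sum()


# Synthetic stand-in for x_latent (assumption: 1000 samples, 292 latent dimensions)
rng = np.random.default_rng(0)
x_latent_demo = rng.normal(size=(1000, 292))

ratio = explained_variance_ratio(x_latent_demo)
print("variance captured by PC1 and PC2:", ratio, "total:", ratio.sum())
```

With real encodings, replace `x_latent_demo` with `x_latent`. If the two components capture only a small share of the variance, the 2D scatter plot is hiding most of the latent structure, which could be one reason it looks different from a plot of a genuinely 2D latent space.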
@hstone1 Could you please explain how you obtained the image shown above?
Any ideas on how to reproduce the figure displayed in the README and in the paper?
I would like to generate a picture akin to the one displayed in the README. However, even though I train my model past the point reported in the README, I do not get the distinct striations shown. Instead, I get a more spread-out plot that still has some striations.
Has anyone been able to replicate the image as displayed in the paper and README? Was it generated using an actual 2D latent dimension, or a higher-dimensional latent space reduced to 2D with PCA? I have tried both and neither has worked. Any help would be greatly appreciated.