@maisamwasti did you figure out the solution to this?
For the embedding layer, you can set weights=[emb], where emb = CreateEmbedding() builds the embedding matrix, i.e. pulls the vectors out of the GloVe embeddings dictionary you have pre-loaded in memory.
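As a rough sketch (the tiny GloVe dict, word index, and sizes here are stand-ins for whatever you already have loaded; newer Keras versions may want embeddings_initializer=Constant(emb) instead of the weights argument):

```python
import numpy as np
from tensorflow.keras.layers import Embedding

# Stand-ins for the pre-loaded GloVe dictionary and the tokenizer's
# word index; in practice these come from your own preprocessing.
glove = {"hello": np.random.rand(50), "world": np.random.rand(50)}
word_index = {"hello": 1, "world": 2}
VOCAB_SIZE, EMBEDDING_DIM = 3, 50

def create_embedding(glove, word_index, vocab_size, embedding_dim):
    """Stack the GloVe vectors into a matrix aligned with the word index."""
    emb = np.zeros((vocab_size, embedding_dim))
    for word, i in word_index.items():
        if i < vocab_size and word in glove:
            emb[i] = glove[word]  # rows for out-of-vocabulary words stay zero
    return emb

emb = create_embedding(glove, word_index, VOCAB_SIZE, EMBEDDING_DIM)
embedding_layer = Embedding(VOCAB_SIZE, EMBEDDING_DIM,
                            weights=[emb],    # initialise with GloVe
                            trainable=False)  # keep the vectors fixed
```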
Introduction: I have trained autoencoders (vanilla and variational) in Keras on MNIST images, and have observed how well the latent representation in the bottleneck layer clusters them.
Objective: I want to do the same for short texts. Tweets specifically! I want to cluster them together based on their semantics using pre-trained GloVe embeddings.
As a start, I am planning to create a CNN encoder and a CNN decoder, before moving on to LSTMs/GRUs.
Problem: What should be the correct loss? How do I implement it in Keras?
This is what my Keras model looks like:
INPUT_TWEET (Word indexes) >> EMBEDDING LAYER >> CNN_ENCODER >> BOTTLENECK >> CNN_DECODER >> OUTPUT_TWEET (Word indexes)
This is clearly wrong, because it minimizes the MSE loss between the input and the output word indexes, whereas I think the loss should be computed in embedding space, i.e. between the outputs of embedding_1 and conv1d_2.
Now how do I do it? Does it make sense? Is there a way to do this in Keras? Please check my code below:
The code:
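(A minimal sketch of the pipeline above; the sizes and filter counts are placeholders rather than my exact layers.)

```python
import numpy as np
from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                     GlobalMaxPooling1D, Dense, Reshape)
from tensorflow.keras.models import Model

MAX_LEN, VOCAB_SIZE, EMBEDDING_DIM = 40, 20000, 50
emb = np.random.rand(VOCAB_SIZE, EMBEDDING_DIM)  # stands in for the GloVe matrix

inp = Input(shape=(MAX_LEN,))                             # tweet as zero-padded word indexes
x = Embedding(VOCAB_SIZE, EMBEDDING_DIM,
              weights=[emb], trainable=False)(inp)        # embedding_1
x = Conv1D(64, 3, activation='relu', padding='same')(x)   # CNN_ENCODER
bottleneck = GlobalMaxPooling1D()(x)                      # BOTTLENECK (latent vector)
x = Dense(MAX_LEN * 64, activation='relu')(bottleneck)
x = Reshape((MAX_LEN, 64))(x)
x = Conv1D(64, 3, activation='relu', padding='same')(x)   # CNN_DECODER
out = Conv1D(1, 3, padding='same')(x)                     # one value per position
out = Reshape((MAX_LEN,))(out)                            # back to an index-shaped array

autoencoder = Model(inp, out)
autoencoder.compile(optimizer='adam', loss='mse')         # MSE on raw word indexes
```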
I don't want it to do this:
It is obviously just trying to reconstruct the zero-padded array of word indexes because of the ill-suited loss.
Question: Does it make sense to you? What should be the correct loss? How do I implement it in Keras?
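One direction I am considering: since the embedding layer is frozen, I can make the decoder emit embedding-dimension vectors (conv1d_2) and fit against the GloVe vectors of the input tokens, so the MSE is computed in embedding space. A sketch of that idea, again with placeholder sizes:

```python
import numpy as np
from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                     GlobalMaxPooling1D, Dense, Reshape)
from tensorflow.keras.models import Model

MAX_LEN, VOCAB_SIZE, EMBEDDING_DIM = 40, 20000, 50
emb = np.random.rand(VOCAB_SIZE, EMBEDDING_DIM)  # stands in for the GloVe matrix

inp = Input(shape=(MAX_LEN,))
x = Embedding(VOCAB_SIZE, EMBEDDING_DIM,
              weights=[emb], trainable=False)(inp)        # embedding_1 (frozen)
x = Conv1D(64, 3, activation='relu', padding='same')(x)   # CNN_ENCODER
bottleneck = GlobalMaxPooling1D()(x)                      # BOTTLENECK
x = Dense(MAX_LEN * 64, activation='relu')(bottleneck)
x = Reshape((MAX_LEN, 64))(x)
x = Conv1D(64, 3, activation='relu', padding='same')(x)   # CNN_DECODER
decoded = Conv1D(EMBEDDING_DIM, 3, padding='same')(x)     # conv1d_2: embedding-space output

autoencoder = Model(inp, decoded)
autoencoder.compile(optimizer='adam', loss='mse')         # MSE now lives in embedding space

# Targets are the GloVe vectors of the input tokens, not the indexes:
x_train = np.random.randint(0, VOCAB_SIZE, size=(8, MAX_LEN))  # toy batch
autoencoder.fit(x_train, emb[x_train], epochs=1)  # emb[x_train]: (8, MAX_LEN, EMBEDDING_DIM)
```

Because the embedding layer is frozen, the targets can be precomputed outside the graph, which avoids writing a custom loss function entirely; I believe the same effect could be had with model.add_loss on the internal embedding tensor if the embeddings ever become trainable.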