rohit-das opened 4 years ago
When embeddings are learned as part of a classification task, the results depend on the underlying data, the labels, the embedding size, and how well the model fits. Maybe those words simply appear together in your dataset. You could also experiment with the embedding layer size (think of it as the number of features representing each word) and retrain the model.
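One common way to probe such learned embeddings is to pull out the embedding layer's weight matrix and compare rows with cosine similarity. A minimal sketch, using a small random matrix and a toy vocabulary as stand-ins (in Keras the real matrix would come from something like `model.get_layer("embedding").get_weights()[0]` — both names here are assumptions, not from the thread):

```python
import numpy as np

# Stand-in for a trained Embedding layer's weights: one row per word.
rng = np.random.default_rng(0)
vocab = ["good", "great", "bad", "movie"]      # toy vocabulary (assumption)
embeddings = rng.normal(size=(len(vocab), 8))  # shape: (vocab_size, embedding_dim)

def cosine_similarity(a, b):
    """Cosine of the angle between two word vectors (1 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_similar(word, k=2):
    """Rank the other vocabulary words by cosine similarity to `word`."""
    i = vocab.index(word)
    scores = [(w, cosine_similarity(embeddings[i], embeddings[j]))
              for j, w in enumerate(vocab) if j != i]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]

print(most_similar("good"))
```

With embeddings trained for classification, the nearest neighbors reflect whatever distinctions helped the model separate the labels, which is why they can look different from word2vec-style neighborhoods.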
For direct word embeddings the output made sense. But how do we interpret the relationships between words in embeddings learned from a classification task?
Is there a way to put this in better context?