The principle and motivation: why are the objective functions designed around word embeddings?

... I hope this answers your question: why do the objective functions compute gradients with respect to the word embeddings?

Answer: Because it's adversarial training. The idea comes from adversarial examples in CV, where a small perturbation is added to the continuous pixel values of an image. The same concept is used here to compute adversarial perturbations for text inputs. Text is discrete, though: a small perturbation of a one-hot encoding such as [0 1 0 0] no longer corresponds to any word, so it isn't meaningful. Word embeddings are continuous vectors, so small perturbations on them do make sense, and that is why the gradients are taken with respect to the embeddings.
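To make this concrete, here is a minimal pure-Python sketch of a gradient-based perturbation on word embeddings. It is not the code from this repository: the toy classifier (logistic regression over the mean embedding), the embedding table, the weights, and the names `loss_and_grad` and `adversarial_perturbation` are all assumptions made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(embedded, w, b, y):
    """Return the binary cross-entropy loss of a toy classifier and the
    gradient of that loss with respect to every word embedding.

    The (hypothetical) classifier averages the word embeddings and applies
    logistic regression with weights w and bias b.
    """
    n, dim = len(embedded), len(w)
    mean = [sum(vec[d] for vec in embedded) / n for d in range(dim)]
    p = sigmoid(sum(w[d] * mean[d] for d in range(dim)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # d(loss)/d(embedding_i) = (p - y) * w / n, identical for each position
    grad = [[(p - y) * w[d] / n for d in range(dim)] for _ in range(n)]
    return loss, grad

def adversarial_perturbation(grad, epsilon):
    """Rescale the gradient so the whole perturbation has L2 norm epsilon."""
    norm = math.sqrt(sum(g * g for row in grad for g in row))
    if norm == 0.0:
        return [[0.0 for _ in row] for row in grad]
    return [[epsilon * g / norm for g in row] for row in grad]

# Toy data, assumed for illustration only.
embedding_table = {"good": [0.9, 0.1], "movie": [0.2, 0.3], "bad": [-0.8, 0.2]}
w, b = [1.5, -0.5], 0.0
tokens, label = ["good", "movie"], 1

# The lookup turns discrete tokens into continuous vectors; the perturbation
# is added to these vectors, never to the one-hot token ids.
embedded = [embedding_table[t] for t in tokens]
clean_loss, grad = loss_and_grad(embedded, w, b, label)
r_adv = adversarial_perturbation(grad, epsilon=0.5)
perturbed = [[e + r for e, r in zip(vec, rvec)] for vec, rvec in zip(embedded, r_adv)]
adv_loss, _ = loss_and_grad(perturbed, w, b, label)

print(f"clean loss: {clean_loss:.4f}, adversarial loss: {adv_loss:.4f}")
```

The perturbed vectors usually don't match any word in the vocabulary, and that's fine: the perturbation only needs to raise the loss during training, so the model learns to be robust in embedding space.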

DevSinghSachan / ssl_text_classification

The principle and motivation: why are the objective functions designed around word embeddings? #5