Open guotong1988 opened 5 years ago
... Hope my answer is what you want. Why do the objective functions compute gradients over the word embeddings?
Answer: Because it's adversarial training. The idea comes from adversarial examples in CV/images, and here the same concept is used to compute adversarial perturbations for text inputs. If you computed the perturbation on the one-hot encoding, it would be weird: adding a small perturbation to something like [0 1 0 0] gives a vector that is no longer a valid one-hot encoding. Word embeddings are continuous vectors, so a small perturbation on them does make sense, and that is why the gradients are taken with respect to the embeddings.
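To make that concrete, here is a minimal sketch in plain Python. It is not the code from this repository: the toy vocabulary, the `EMBEDDINGS` and `WEIGHTS` values, the mean-pooled linear classifier, and the helper names are all made up for illustration. It computes the gradient of the loss with respect to the looked-up embeddings and adds an L2-normalized step of size `epsilon` in that direction, which is one common way to build the perturbation.

```python
import math

# Hypothetical toy setup: 4-word vocabulary, 3-dimensional embeddings.
EMBEDDINGS = [
    [0.1, -0.2, 0.3],
    [0.4, 0.0, -0.1],
    [-0.3, 0.2, 0.5],
    [0.2, 0.1, -0.4],
]
# Hypothetical linear classifier applied to the mean of the token embeddings.
WEIGHTS = [0.5, -1.0, 0.8]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(embedded, label):
    """Binary cross-entropy loss and its gradient w.r.t. each token embedding."""
    n = len(embedded)
    mean = [sum(tok[d] for tok in embedded) / n for d in range(len(WEIGHTS))]
    p = sigmoid(sum(w * m for w, m in zip(WEIGHTS, mean)))
    loss = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    # dL/dlogit = p - label, and dlogit/d(embedded[i][d]) = WEIGHTS[d] / n.
    grad = [[(p - label) * w / n for w in WEIGHTS] for _ in embedded]
    return loss, grad


def adversarial_perturbation(grad, epsilon):
    """Scale the gradient to have L2 norm epsilon over the whole sequence."""
    norm = math.sqrt(sum(g * g for row in grad for g in row))
    if norm == 0.0:
        return [[0.0 for _ in row] for row in grad]
    return [[epsilon * g / norm for g in row] for row in grad]


# The input is a sequence of token ids; the perturbation is applied after
# the embedding lookup, never to the discrete ids or one-hot vectors.
token_ids = [1, 0, 2]
label = 1
embedded = [list(EMBEDDINGS[i]) for i in token_ids]

clean_loss, grad = loss_and_grad(embedded, label)
r_adv = adversarial_perturbation(grad, epsilon=0.1)
perturbed = [[e + r for e, r in zip(tok, r_tok)]
             for tok, r_tok in zip(embedded, r_adv)]
adv_loss, _ = loss_and_grad(perturbed, label)

print("clean loss:", clean_loss)
print("adversarial loss:", adv_loss)
```

In adversarial training the loss on `perturbed` would be added to the clean loss as an extra objective term. The perturbed embeddings do not need to correspond to any real word; they only need to be nearby points in the continuous embedding space.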
@DevSinghSachan Thank you!