MingyuKim-2933 closed this issue 12 months ago
Hi,
Yes, your understanding is correct. We only use embeddings of single, complete words. Note that learning these embeddings can improve robustness on sentence-level tasks, while for token-level tasks it brings only marginal improvements. You could try distilled embeddings for the latter.
Thank you for your answer!
Hello! I am currently trying to reproduce the LOVE model, and I have a question about this sentence in your paper: "For ease of implementation, we learn only from the words that are not separated into pieces."
As I understand it, the vocab.txt file you provided does not include special tokens (e.g. "[PAD]", "[UNK]") or separated word pieces (e.g. ##ir, ##di).
Is my understanding correct?
Thank you!
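The filtering asked about above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a BERT-style vocab.txt with one token per line, where special tokens are bracketed (e.g. "[PAD]") and subword pieces carry the "##" prefix.

```python
def keep_whole_words(tokens):
    """Keep only single, complete words: drop special tokens and word pieces."""
    kept = []
    for tok in tokens:
        # Bracketed entries are special tokens, e.g. [PAD], [UNK].
        if tok.startswith("[") and tok.endswith("]"):
            continue
        # "##" marks a WordPiece continuation piece, e.g. ##ir, ##di.
        if tok.startswith("##"):
            continue
        kept.append(tok)
    return kept

# Tiny made-up vocab for illustration:
vocab = ["[PAD]", "[UNK]", "the", "bird", "##ir", "##di", "word"]
print(keep_whole_words(vocab))  # → ['the', 'bird', 'word']
```

Applied to a real vocab.txt, this keeps exactly the words that are not separated into pieces, which matches the sentence quoted from the paper.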