Closed jej127 closed 1 year ago
Hello! I have a question about the details of the FastText baseline in Tables 2-4. When handling OOV words, this baseline has two choices: 1) representing OOV words with null vectors, or 2) computing vectors for OOV words by summing their n-gram vectors.
In the terminology of Bojanowski et al. (2017) [1], the first option corresponds to the "sisg-" setting, while the second corresponds to the "sisg" setting. Could you please specify which option was used in your experiments? My conjecture leans toward option 1), since option 2) does not seem to follow a mimick-like model. Nonetheless, I would greatly appreciate your guidance on this matter. Thank you in advance for your help!
[1] Bojanowski et al., Enriching Word Vectors with Subword Information, TACL 2017.
Hi,
Thanks for asking.
We used n-gram vectors to impute representations for OOV words in the FastText baseline (i.e., the "sisg" setting). FastText embeddings can also serve as the teacher targets for other mimick-like models.
Thanks for the assistance. It is really helpful.
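For readers unfamiliar with the distinction, here is a minimal sketch of the "sisg" strategy discussed above: an OOV word's vector is composed from its character n-gram vectors. All names and the toy embedding table below are illustrative, not from the paper's code; in a trained FastText model the n-gram vectors are learned, not random.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    # FastText wraps the word in boundary markers '<' and '>'
    # before extracting character n-grams.
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

# Toy n-gram embedding table (in a real model, learned during training).
rng = np.random.default_rng(0)
dim = 4
ngram_vecs = {}

def ngram_vec(ng):
    # Look up (or, for this toy example, lazily create) an n-gram vector.
    if ng not in ngram_vecs:
        ngram_vecs[ng] = rng.standard_normal(dim)
    return ngram_vecs[ng]

def oov_vector(word):
    # "sisg": compose the OOV word's representation from its n-gram vectors.
    grams = char_ngrams(word)
    return np.mean([ngram_vec(g) for g in grams], axis=0)

v = oov_vector("unseenword")
print(v.shape)  # (4,)
```

Under the "sisg-" setting, by contrast, `oov_vector` would simply return a null (zero) vector for any word outside the training vocabulary.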