I searched for this error, and it seems to occur when a word is not present in the training vocabulary and none of its character n-grams were seen during training either. In that case, the FastText model cannot return a meaningful word vector for the input word.
But I already used all the training data in the "MathtagArticles" directory.
Is there anything I am missing?
ERROR:root:'all ngrams for word \ueae8\uea8c𑇗Ǵ absent from model'
Traceback (most recent call last):
File "/app/TangentCFT/tangent_cft_module.py", line 115, in __get_vector_representation
temp_vector = temp_vector + self.model.get_vector_representation(encoded_tuple)
File "/app/TangentCFT/tangent_cft_model.py", line 45, in get_vector_representation
return self.model.wv[encoded_math_tuple]
File "/home/user/miniconda/lib/python3.9/site-packages/gensim/models/keyedvectors.py", line 169, in __getitem__
return self.get_vector(entities)
File "/home/user/miniconda/lib/python3.9/site-packages/gensim/models/keyedvectors.py", line 277, in get_vector
return self.word_vec(word)
File "/home/user/miniconda/lib/python3.9/site-packages/gensim/models/keyedvectors.py", line 1622, in word_vec
raise KeyError('all ngrams for word %s absent from model' % word)
KeyError: 'all ngrams for word \ueae8\uea8c𑇗Ǵ absent from model'
The error appears many times, each time with a different word.
After that, the run still produced the "slt_ret.tsv" file. Will these errors cause any problems for the retrieval results?
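To get a feel for how much this might matter, one option would be to count how many encoded tuples fail the lookup for each formula. Below is a minimal sketch of that idea, not TangentCFT's actual code: `average_known_vectors` and `toy_vectors` are hypothetical names, and the only assumption about the model is that `vectors[key]` raises `KeyError` for an unknown tuple, as `self.model.wv[...]` does in the traceback above.

```python
from typing import Iterable, List, Mapping, Optional, Sequence, Tuple


def average_known_vectors(
    vectors: Mapping[str, Sequence[float]],
    encoded_tuples: Iterable[str],
) -> Tuple[Optional[List[float]], int]:
    """Average the vectors of the tuples that can be looked up.

    Returns (mean vector, or None if nothing could be looked up,
    number of tuples skipped because the lookup raised KeyError).
    """
    total: Optional[List[float]] = None
    found = 0
    skipped = 0
    for encoded_tuple in encoded_tuples:
        try:
            vec = vectors[encoded_tuple]
        except KeyError:
            # Same failure mode as in the traceback: no vector available.
            skipped += 1
            continue
        if total is None:
            total = [0.0] * len(vec)
        total = [t + v for t, v in zip(total, vec)]
        found += 1
    if total is None:
        return None, skipped
    return [t / found for t in total], skipped


# Hypothetical stand-in for the trained model's word vectors.
toy_vectors = {"a": [1.0, 3.0], "b": [3.0, 5.0]}
mean, skipped = average_known_vectors(toy_vectors, ["a", "b", "\ueae8\uea8c"])
print(mean, skipped)
```

If the skipped count is small relative to the number of tuples in a formula, the averaged vector is probably still usable. A formula where every tuple is skipped has no vector at all, and its retrieval result would be the one most likely to be affected.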