No embedding is generated by Word2Vec model of Gensim for some of the annotated terms with a known MeSHID

This error may occur irrespective of the data that you use for training the model, be it the provided sample data or the entire RELISH corpus. The most likely reason is the min_count parameter, which we always set to 5. This means that every word in the corpus (sample data or the entire RELISH) with a frequency of less than 5 is ignored during training. Such words end up with no embedding associated with them.
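To check which terms are affected, one option is to count token frequencies yourself before training. The snippet below is only a minimal sketch of that idea: the function name words_below_min_count, the toy corpus and the placeholder tokens meshid_a and meshid_b are all made up for illustration and are not taken from the repository. It assumes the corpus is a list of tokenized documents and mirrors the frequency cutoff with plain Python rather than calling Gensim.

```python
from collections import Counter


def words_below_min_count(corpus, min_count=5):
    """Return {word: frequency} for words occurring fewer than min_count times.

    `corpus` is assumed to be a list of tokenized documents (lists of strings).
    """
    counts = Counter(token for doc in corpus for token in doc)
    return {word: n for word, n in counts.items() if n < min_count}


# Hypothetical toy corpus; "meshid_a" and "meshid_b" are placeholder tokens.
toy_corpus = [
    ["protein", "binding", "meshid_a"],
    ["protein", "kinase", "meshid_a"],
    ["protein", "binding", "meshid_a"],
    ["protein", "folding", "meshid_a"],
    ["protein", "binding", "meshid_a", "meshid_b"],
]

# Words listed here would fall under a cutoff of 5 and get no embedding.
dropped = words_below_min_count(toy_corpus, min_count=5)
print(dropped)
```

In this toy corpus, meshid_a occurs 5 times and is kept, while meshid_b occurs only once and would be dropped. Running the same count over your own tokenized input should show whether the term you are missing is simply too rare.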

If you look at the code below from the script generate_embeddings.py, you will notice that we use a try-except block to skip the embedding lookup for words with a frequency of less than 5, since they are not part of the trained model's vocabulary.

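for word in article_doc[iteration]:
    try:
        # Look up the trained vector for this word.
        embedding_list.append(word_vectors.wv[word])
    except:
        # The word is not in the vocabulary (e.g. dropped by min_count), so it is skipped and counted.
        missing_words += 1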
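If you want to see which words are skipped rather than only how many, the loop can record the words themselves. The following is a hedged sketch and not the code from generate_embeddings.py: collect_embeddings is a hypothetical helper, and a plain dict stands in for the trained model's word vectors so that the example runs without Gensim. As far as I know, Gensim's KeyedVectors also raises KeyError for a word that is not in the vocabulary, so the same pattern should carry over, but please treat that as an assumption to verify against your Gensim version.

```python
def collect_embeddings(tokens, vectors):
    """Split tokens into found vectors and a list of out-of-vocabulary words.

    `vectors` is any mapping from word to vector that raises KeyError
    for unknown words (a plain dict here, standing in for model.wv).
    """
    found, missing = [], []
    for word in tokens:
        try:
            found.append(vectors[word])
        except KeyError:
            # Only a failed lookup is treated as "missing"; other errors propagate.
            missing.append(word)
    return found, missing


# Hypothetical stand-in for trained word vectors.
toy_vectors = {"protein": [0.1, 0.2], "binding": [0.3, 0.4]}

found, missing = collect_embeddings(["protein", "meshid_b", "binding"], toy_vectors)
print(len(found), missing)
```

Catching KeyError specifically, instead of using a bare except, keeps unrelated problems from being silently counted as missing words.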

Let me know whether you encountered this error while running the script, or whether you were specifically looking for the embedding of the above-mentioned word. If it is the former, I will look into what is causing the error.
