sebischair / Lbl2Vec

Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.
https://wwwmatthes.in.tum.de/pages/naimi84squl1/Lbl2Vec-An-Embedding-based-Approach-for-Unsupervised-Document-Retrieval-on-Predefined-Topics
BSD 3-Clause "New" or "Revised" License

ValueError: cannot compute mean with no input #4

Open krishnarevi opened 2 years ago

krishnarevi commented 2 years ago

Does this model support German keywords? There is an issue when trying to fit the model with German keywords. Can you please suggest a fix?

VMD7 commented 2 years ago

Regarding this issue, you can remove the epochs field or set epochs to more than 4. This might help to solve the issue.

adri0 commented 1 year ago

I'm also getting this error when trying to run the code from Tim's article, "Unsupervised Text Classification with Lbl2Vec".

jarolim14 commented 1 year ago

Has anyone found the source of the error?

isaldiviagonzatti commented 7 months ago

Same here, getting "cannot compute mean with no input" and no way to solve it. @krishnarevi is there any fix to this? What exactly is going on? Thanks

isaldiviagonzatti commented 7 months ago

"Regarding this issue, you can remove the epochs field or set epochs to more than 4. This might help to solve the issue."

This does not work.

1jamesthompson1 commented 6 months ago

I am also having this problem.

ValueError: cannot compute mean with no input

I can provide more information if wanted, but this is the model call:

Lbl2Vec_model = Lbl2Vec(
    keywords_list=list(labels.keywords),
    tagged_documents=full_corpus['tagged_responses'][full_corpus['data_set_type'] == 'train'],
    label_names=list(labels.class_name),
    similarity_threshold=0.43,
    min_num_docs=5,
    epochs=10,
)

shashankmc commented 2 months ago

I am facing the same issue and after debugging I found that the problem arises here - https://github.com/sebischair/Lbl2Vec/blob/d9efdf5969c433dfea22673cec69865da4534f38/lbl2vec/lbl2vec.py#L609

I always get a warning stating that the keywords in keywords_list are unknown to the Doc2Vec model and therefore will not be used for training. This results in an empty cleaned_keywords_list and empty keyword_vectors, which are passed on to the method doc2vec_model.dv.most_similar from the Doc2Vec class. Since there aren't any keyword_vectors, the mean cannot be computed.

The only option that makes sense to me at this point (and which the logger warning or info message also mentions) is to change your keywords to words that are present in the Doc2Vec model, or to train your own Doc2Vec model.
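
To make that check concrete, here is a minimal sketch of how one might find out, before fitting, which labels would end up with no usable keywords. It is not part of Lbl2Vec: the helper name split_known_keywords, the toy vocabulary, and the choice to lowercase keywords are all assumptions made for illustration. The lowercasing step only makes sense if your documents were lowercased during tokenization. With a trained gensim 4.x Doc2Vec model, the vocab argument could be its wv.key_to_index mapping.

```python
from typing import Callable, Container, Dict, List, Sequence, Tuple


def split_known_keywords(
    keywords_list: Sequence[Sequence[str]],
    label_names: Sequence[str],
    vocab: Container[str],
    normalize: Callable[[str], str] = str.lower,
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Hypothetical helper: split each label's keywords into those present
    in the vocabulary and those missing from it.

    `normalize` is an assumption about preprocessing; pass `lambda k: k`
    if the documents were not lowercased.
    """
    known: Dict[str, List[str]] = {}
    unknown: Dict[str, List[str]] = {}
    for label, keywords in zip(label_names, keywords_list):
        normalized = [normalize(k) for k in keywords]
        known[label] = [k for k in normalized if k in vocab]
        unknown[label] = [k for k in normalized if k not in vocab]
    return known, unknown


# Toy vocabulary standing in for the trained model's word list
# (for gensim 4.x this could be doc2vec_model.wv.key_to_index).
toy_vocab = {"fussball": 0, "tor": 1, "wahl": 2}

known, unknown = split_known_keywords(
    keywords_list=[["Fussball", "Tor"], ["Politik", "Regierung"]],
    label_names=["sport", "politik"],
    vocab=toy_vocab,
)

# Labels with no known keywords are the ones that would leave the
# keyword vector list empty.
empty_labels = [label for label, words in known.items() if not words]
print("known:", known)
print("unknown:", unknown)
print("labels without usable keywords:", empty_labels)
```

Note that a word can occur in your corpus and still be missing from the model's vocabulary if it falls below the minimum frequency cutoff (gensim's min_count), so checking against the raw text alone may not be enough.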