MIND-Lab / OCTIS

OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)
MIT License

AttributeError: 'KeyedVectors' object has no attribute 'wv' #31

Closed: iraari closed this issue 3 years ago

iraari commented 3 years ago

Description

When I try to evaluate a model with the WordEmbeddingsInvertedRBOCentroid() metric, I get the error AttributeError: 'KeyedVectors' object has no attribute 'wv'.

What I Did

from octis.dataset.dataset import Dataset
dataset = Dataset()
dataset.load_custom_dataset_from_folder("custom_dataset")

from octis.models.LDA import LDA
model_LDA_15 = LDA(num_topics=15) 
model_LDA_15_output = model_LDA_15.train_model(dataset)

from octis.evaluation_metrics.diversity_metrics import WordEmbeddingsInvertedRBOCentroid
rbo_centroid_metric = WordEmbeddingsInvertedRBOCentroid()
topic_rbo_centroid_score = rbo_centroid_metric.score(model_LDA_15_output)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-45-eb5075095fc9> in <module>
      1 from octis.evaluation_metrics.diversity_metrics import WordEmbeddingsInvertedRBOCentroid
      2 rbo_centroid_metric = WordEmbeddingsInvertedRBOCentroid()
----> 3 topic_rbo_centroid__score = rbo_centroid_metric.score(model_LDA_15_output)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/octis/evaluation_metrics/diversity_metrics.py in score(self, model_output)
    174                 indexed_list1 = [word2index[word] for word in list1]
    175                 indexed_list2 = [word2index[word] for word in list2]
--> 176                 rbo_val = weirbo_centroid(
    177                     indexed_list1[:self.topk], indexed_list2[:self.topk], p=self.weight, index2word=index2word,
    178                     word2vec=self.wv, norm=self.norm)[2]

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/octis/evaluation_metrics/word_embeddings_rbo_centroid.py in word_embeddings_rbo(list1, list2, p, index2word, word2vec, norm)
    145     args = (list1, list2, p, index2word, word2vec, norm)
    146 
--> 147     return RBO(rbo_min(*args), rbo_res(*args), rbo_ext(*args))
    148 
    149 

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/octis/evaluation_metrics/word_embeddings_rbo_centroid.py in rbo_min(list1, list2, p, index2word, word2vec, norm, depth)
     79     """
     80     depth = min(len(list1), len(list2)) if depth is None else depth
---> 81     x_k = overlap(list1, list2, depth, index2word, word2vec, norm)
     82     log_term = x_k * math.log(1 - p)
     83     sum_term = sum(

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/octis/evaluation_metrics/word_embeddings_rbo_centroid.py in overlap(list1, list2, depth, index2word, word2vec, norm)
     59     # NOTE: comment the preceding and uncomment the following line if you want
     60     # to stick to the algorithm as defined by the paper
---> 61     ov = embeddings_overlap(list1, list2, depth, index2word, word2vec, norm=norm)[0]
     62     # print("overlap", ov)
     63     return ov

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/octis/evaluation_metrics/word_embeddings_rbo_centroid.py in embeddings_overlap(list1, list2, depth, index2word, word2vec, norm)
     41     word_list2 = [index2word[index] for index in list2]
     42 
---> 43     centroid_1 = np.mean([word2vec.wv[w] for w in word_list1[:depth]], axis=0)
     44     centroid_2 = np.mean([word2vec.wv[w] for w in word_list2[:depth]], axis=0)
     45     cos_sim = 1 - distance.cosine(centroid_1, centroid_2)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/octis/evaluation_metrics/word_embeddings_rbo_centroid.py in <listcomp>(.0)
     41     word_list2 = [index2word[index] for index in list2]
     42 
---> 43     centroid_1 = np.mean([word2vec.wv[w] for w in word_list1[:depth]], axis=0)
     44     centroid_2 = np.mean([word2vec.wv[w] for w in word_list2[:depth]], axis=0)
     45     cos_sim = 1 - distance.cosine(centroid_1, centroid_2)

AttributeError: 'KeyedVectors' object has no attribute 'wv'

silviatti commented 3 years ago

Hi! Thanks for reporting the issue.

I have just released a new version of OCTIS (v1.9.0) that fixes this problem. Run pip install -U octis to get the latest version. Feel free to reopen the issue if the problem persists.

Silvia

electron-v commented 2 years ago

Please refer to this answer: https://stackoverflow.com/a/73931420/12008176
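
For readers who hit the same error in their own code, here is a minimal sketch of one common way to avoid it. It assumes the cause is the one the traceback suggests: the code calls `word2vec.wv[w]` on an object that is already a `KeyedVectors` instance, which in recent gensim versions has no `.wv` attribute (only a full `Word2Vec` model does). The helper names `resolve_keyed_vectors` and `centroid`, and the two stand-in classes, are hypothetical and exist only for illustration; this is not the fix that was shipped in OCTIS.

```python
# Hypothetical helper, not part of OCTIS or gensim.
def resolve_keyed_vectors(embeddings):
    """Return the object that supports word -> vector lookup.

    A full Word2Vec-style model keeps its vectors under `.wv`; a
    KeyedVectors-style object is itself the lookup table, so it is
    returned unchanged when no `.wv` attribute is present.
    """
    return getattr(embeddings, "wv", embeddings)


def centroid(embeddings, words):
    """Average the vectors of `words`, component by component."""
    vectors = resolve_keyed_vectors(embeddings)
    rows = [vectors[w] for w in words]
    return [sum(column) / len(rows) for column in zip(*rows)]


# Stand-ins for illustration only (assumptions, not gensim classes).
class FakeKeyedVectors(dict):
    """Mimics a KeyedVectors object: indexable by word, no `.wv`."""


class FakeWord2Vec:
    """Mimics a full model: the vectors live under `.wv`."""

    def __init__(self, keyed_vectors):
        self.wv = keyed_vectors


kv = FakeKeyedVectors({"topic": [1.0, 0.0], "model": [0.0, 1.0]})
full_model = FakeWord2Vec(kv)

# Both kinds of object now yield the same centroid.
print(centroid(kv, ["topic", "model"]))
print(centroid(full_model, ["topic", "model"]))
```

With real embeddings, the same `getattr(obj, "wv", obj)` pattern lets calling code accept either a trained model or vectors loaded on their own, at the cost of hiding which of the two was actually passed in.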