MIND-Lab / OCTIS

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)
MIT License

ETM model: AttributeError: 'list' object has no attribute 'squeeze' #35

Closed cayaluke closed 2 years ago

cayaluke commented 2 years ago

Description

I tried to run the ETM model through OCTIS but got an AttributeError. I've attached my corpus (corpus.csv; for some reason GitHub won't let me attach an actual .tsv file) and vocabulary (vocabulary.txt) for your convenience.

What I Did

Here's what I did

I followed the required format: the dataset as a .tsv file and the vocabulary as a .txt file with one stem per row.

I was able to load the dataset with no errors.

To run the model I did the following:

from octis.models.ETM import ETM
model_etm = ETM(num_topics=40)
output_fomc = model_etm.train_model(dataset)


model: ETM(
  (t_drop): Dropout(p=0.5, inplace=False)
  (theta_act): ReLU()
  (rho): Linear(in_features=300, out_features=9659, bias=False)
  (alphas): Linear(in_features=300, out_features=40, bias=False)
  (q_theta): Sequential(
    (0): Linear(in_features=9659, out_features=800, bias=True)
    (1): ReLU()
    (2): Linear(in_features=800, out_features=800, bias=True)
    (3): ReLU()
  )
  (mu_q_theta): Linear(in_features=800, out_features=40, bias=True)
  (logsigma_q_theta): Linear(in_features=800, out_features=40, bias=True)
)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-159-a49502c35827> in <module>
----> 1 output_fomc = model_etm.train_model(dataset)

~\anaconda3\envs\lda_env36\lib\site-packages\octis\models\ETM.py in train_model(self, dataset, hyperparameters, top_words)
     54 
     55         for epoch in range(0, self.hyperparameters['num_epochs']):
---> 56             continue_training = self._train_epoch(epoch)
     57             if not continue_training:
     58                 break

~\anaconda3\envs\lda_env36\lib\site-packages\octis\models\ETM.py in _train_epoch(self, epoch)
    120             self.model.zero_grad()
    121             data_batch = data.get_batch(self.train_tokens, self.train_counts, ind, len(self.vocab.keys()),
--> 122                                         self.hyperparameters['embedding_size'], self.device)
    123             sums = data_batch.sum(1).unsqueeze(1)
    124             if self.hyperparameters['bow_norm']:

~\anaconda3\envs\lda_env36\lib\site-packages\octis\models\ETM_model\data.py in get_batch(tokens, counts, ind, vocab_size, emsize, device)
     15         #L = count.shape[1]
     16         if len(doc) == 1:
---> 17             doc = [doc.squeeze()]
     18             count = [count.squeeze()]
     19         else:

AttributeError: 'list' object has no attribute 'squeeze'

[vocabulary.txt](https://github.com/MIND-Lab/OCTIS/files/7423616/vocabulary.txt)
[corpus.csv](https://github.com/MIND-Lab/OCTIS/files/7423618/corpus.csv)
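
Reading the traceback, `get_batch` reached its `len(doc) == 1` branch while `doc` was a plain Python list, which has no `.squeeze()` method. One possible trigger, and this is only a guess since the thread doesn't show the token structure, is a document that ends up with exactly one token after preprocessing. A minimal sketch for checking that, assuming the tokens are stored as a list of per-document lists of vocabulary indices; `find_single_token_docs` and `toy_tokens` are hypothetical names and not part of OCTIS:

```python
# Hypothetical diagnostic helper; NOT part of OCTIS.
# Assumption: tokens are a list with one entry per document, each entry
# holding that document's vocabulary indices.

def find_single_token_docs(token_lists):
    """Return indices of documents that would enter the `len(doc) == 1`
    branch with an object lacking `.squeeze()` (e.g. a plain list)."""
    return [
        i
        for i, doc in enumerate(token_lists)
        if len(doc) == 1 and not hasattr(doc, "squeeze")
    ]


# Toy data standing in for the real training tokens (made up for illustration).
toy_tokens = [[0, 4, 7], [2], [1, 3], [5]]

suspects = find_single_token_docs(toy_tokens)
print(suspects)  # indices of the one-token documents in the toy data
```

If a check like this flags documents in the real corpus, dropping or merging them before training would be one thing to try. Whether that matches what actually resolved this issue isn't stated here.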

cayaluke commented 2 years ago

@silviatti Hi, I fixed the problem, so I'm going to close this now. Thank you anyway.