adjidieng / ETM

Topic Modeling in Embedding Spaces
MIT License
549 stars 128 forks

getting nan as loss for Short reviews #3

Closed: Mandark27 closed this issue 4 years ago

Mandark27 commented 5 years ago

Is ETM applicable to short reviews? What is the minimum number of words, excluding stopwords, that a review must contain for the model to run?

I am getting NaN as the loss because the output tensor from q_theta is full of NaN values.

adjidieng commented 5 years ago

Yes, it works for any document size. You may not have prepared your dataset in the right format, or you might have chosen too large a learning rate. Also, if you choose ReLU as the activation for the inference network for q_theta, make sure you normalize the bag-of-words input by setting the option bow_norm to 1.
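
To illustrate the normalization advice above, here is a minimal sketch in plain Python. It assumes that normalizing the bag of words means dividing each document's counts by that document's total count. It also assumes, without confirmation from this thread, that a short review can become empty once stopwords are removed, in which case the division would be 0/0 and could produce NaN. The helper names `find_empty_docs` and `normalize_bows` are hypothetical and are not part of the ETM code.

```python
from typing import List


def find_empty_docs(bows: List[List[float]]) -> List[int]:
    """Return indices of documents whose bag-of-words row sums to zero."""
    return [i for i, row in enumerate(bows) if sum(row) == 0]


def normalize_bows(bows: List[List[float]]) -> List[List[float]]:
    """Divide each row by its total count so every non-empty row sums to 1.

    Empty rows are kept as all zeros instead of being divided by zero.
    """
    normalized = []
    for row in bows:
        total = sum(row)
        if total == 0:
            normalized.append([0.0] * len(row))
        else:
            normalized.append([count / total for count in row])
    return normalized


# Toy corpus over a 4-word vocabulary (made-up counts). The second review
# is assumed to have lost every token to stopword removal.
bows = [
    [2, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 3, 0, 0],
]

empty = find_empty_docs(bows)
normalized = normalize_bows(bows)

print(empty)          # [1]
print(normalized[0])  # [0.5, 0.0, 0.25, 0.25]
```

If a check like this reports empty documents, dropping them before training is one option worth trying. Whether that resolves the NaN loss in a given setup is not something this thread confirms.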

adjidieng commented 4 years ago

We just added scripts for pre-processing a dataset to the repo. Please check them out and let us know if you still have questions.