bab2min / tomotopy

Python package of Tomoto, the Topic Modeling Tool
https://bab2min.github.io/tomotopy
MIT License

Explaining messages in CTM model example #168

Open juneMJ opened 2 years ago

juneMJ commented 2 years ago

Hello, I'm new to topic modeling and I'm trying the CTM model example. After loading my own data, which includes about 45000 documents, I get these messages during the process:

  1. TruncMultiNormal.hpp(56): wrong truncation range [..., ...]
  2. Failed to sample! Reset prior and retry!
  3. Adding empty document was ignored.

Could you explain what these messages mean? If they affect the modeling, how can I adjust the model to make it work better?

Thank you!

bab2min commented 2 years ago

Hello @juneMJ, thank you for your interest in my package. The first and second warnings are related to the numerical stability of tomotopy's CTM implementation. This currently appears to be a bug in tomotopy, and I'm examining the numerical stability issue. You can find a more detailed discussion at https://github.com/bab2min/tomotopy/issues/165.

The third warning occurs when you call add_doc() with an empty list of words. Since tomotopy models cannot contain empty documents, any document with no words is rejected and is not added to the model.
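
One way to avoid the third warning is to filter out empty documents before calling add_doc(). The following is a minimal sketch, not part of tomotopy itself: the helper name add_nonempty_docs is hypothetical, and it assumes each document is already tokenized into a list of strings. It also drops blank tokens, which is an assumption about the preprocessing and may not suit every pipeline.

```python
from typing import Iterable, List, Sequence, Tuple


def add_nonempty_docs(model, tokenized_docs: Iterable[Sequence[str]]) -> Tuple[int, List[int]]:
    """Add only non-empty documents to `model`.

    `model` is any object with an add_doc(words) method (for example a
    tomotopy model). Returns the number of documents added and the
    positions of the documents that were skipped because they were empty.
    """
    added = 0
    skipped: List[int] = []
    for pos, words in enumerate(tokenized_docs):
        # Assumption: tokens that are empty or whitespace-only carry no meaning.
        cleaned = [w for w in words if w and w.strip()]
        if not cleaned:
            # Skip the document here instead of letting the model reject it.
            skipped.append(pos)
            continue
        model.add_doc(cleaned)
        added += 1
    return added, skipped


# Possible usage with tomotopy (k=20 is an arbitrary illustrative value):
#   import tomotopy as tp
#   mdl = tp.CTModel(k=20)
#   added, skipped = add_nonempty_docs(mdl, my_tokenized_docs)
#   print(f"added {added} documents, skipped {len(skipped)} empty ones")
```

Keeping the skipped positions makes it possible to check which source documents became empty after preprocessing, for example because stop-word removal stripped every token.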

juneMJ commented 2 years ago

Thank you so much for clarifying and referring to the other question. Waiting for the update, all the best!