Savvysherpa / slda

Cython implementations of Gibbs sampling for supervised LDA
MIT License

“underflow” when running sLDA #6

Open niyikai opened 6 years ago

niyikai commented 6 years ago

Everything works when I run on a small dataset, but an “underflow” error occurs when I switch to a large dataset. Can someone help with this problem? Why does this error happen, and how can I solve it?

bearnshaw commented 6 years ago

@niyikai It would be helpful if you posted the error with traceback here.

niyikai commented 6 years ago

@bearnshaw Thank you so much, here is the traceback

start iterations
gsl: exp.c:113: ERROR: underflow
Default GSL error handler invoked.
Abort trap: 6

bearnshaw commented 6 years ago

Hmm, it's hard to tell where the underflow occurs without the compiled C file. My guess is that in one of the log-likelihood functions in _topic_models.pyx, a zero or very small number is being passed to one of the lngamma functions. We use lngamma instead of gamma to protect against overflow, but it might be better to add logic that chooses which one to use depending on the size of the input. Feel free to test that out and submit a pull request if it works.
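For readers who hit the same message: the traceback points at GSL's exp routine, which reports an underflow when the result of an exponential is too small to represent. The sketch below is a general illustration of keeping such calculations in log space, written in plain Python. It is not code from _topic_models.pyx, and the helper names `normalize_log_weights` and `log_sum_exp` are hypothetical. Note that Python's `math.exp` silently returns 0.0 on underflow instead of aborting the way the default GSL error handler does.

```python
import math

def normalize_log_weights(log_weights):
    """Turn unnormalized log-weights into probabilities (hypothetical helper)."""
    m = max(log_weights)
    # Subtracting the maximum makes the largest term exp(0) = 1, so the sum
    # is never zero even when every raw weight would underflow on its own.
    shifted = [math.exp(lw - m) for lw in log_weights]
    total = sum(shifted)
    return [s / total for s in shifted]

def log_sum_exp(log_weights):
    """Compute log(sum(exp(w))) without leaving log space (hypothetical helper)."""
    m = max(log_weights)
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights))

# Log-weights this negative are plausible for likelihoods over a large corpus.
log_w = [-1000.0, -1001.0, -1002.0]

# Exponentiating directly loses everything: each term underflows to 0.0.
naive = [math.exp(lw) for lw in log_w]

# The shifted version keeps the relative weights intact.
probs = normalize_log_weights(log_w)
```

Whether this applies to a given run depends on where the exponential is actually taken, which, as noted above, is hard to tell without the compiled C file.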

niyikai commented 6 years ago

@bearnshaw Thanks a lot! I found that the real cause was that my computer did not have enough memory.