datquocnguyen / LFTM

Improving topic models LDA and DMM (one-topic-per-document model for short texts) with word embeddings (TACL 2015)

LFDMM prediction is a vector of NaN for long documents #8

Closed: strnam closed this issue 6 years ago

strnam commented 6 years ago

Hi Dat Quoc,

Many thanks for your work.

I ran the LFDMM algorithm on my corpus, which mixes short and long documents. Checking the file LFDMM.theta, I found that for any long document containing more than 76 words, the result for that document in LFDMM.theta is a list of NaN values: "NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN"

Is there a threshold on document length for the LFDMM algorithm? I looked at the source code but still could not figure out where this happens.

I hope to hear from you. Thanks, Dat Quoc.
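
To list which documents are affected, the rows of LFDMM.theta can be scanned for NaN values. The following is a minimal sketch, assuming each line of the file holds one document's topic proportions as whitespace-separated numbers (which matches the "NaN NaN NaN ..." row quoted above). The helper name `find_nan_rows` is hypothetical and not part of LFTM.

```python
import math
from typing import Iterable, List


def find_nan_rows(lines: Iterable[str]) -> List[int]:
    """Return 0-based indices of rows that contain at least one NaN.

    Assumption: one document per line, values separated by whitespace.
    """
    bad_rows = []
    for idx, line in enumerate(lines):
        # float("NaN") parses to a float NaN, so no special-casing is needed.
        values = [float(token) for token in line.split()]
        if any(math.isnan(v) for v in values):
            bad_rows.append(idx)
    return bad_rows
```

Passing an open file handle for LFDMM.theta to `find_nan_rows` would return the indices of the affected documents, which can then be compared against their word counts in the corpus file.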

datquocnguyen commented 6 years ago

I am not really sure what happened, as I have not yet evaluated the LFDMM model on long documents.
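
One possible explanation, which this thread does not confirm and which has not been checked against the LFTM source, is floating-point underflow: a one-topic-per-document model scores each topic by a product over all words in the document, and for a long document that product can underflow to 0.0 for every topic, so normalization becomes 0/0. The sketch below illustrates the effect and a log-space alternative. The function names and the probability values are made up for illustration. Python raises an error on `0.0 / 0.0`, so the naive version returns NaN explicitly to mimic Java's behavior.

```python
import math
from typing import List


def normalize_naive(word_probs: List[List[float]]) -> List[float]:
    """word_probs[t] holds the per-word probabilities under topic t.

    Multiplies them directly, then normalizes across topics.
    """
    scores = []
    for probs in word_probs:
        score = 1.0
        for p in probs:
            score *= p  # underflows to 0.0 once the product drops below ~1e-324
        scores.append(score)
    total = sum(scores)
    if total == 0.0:
        # Java would compute 0.0 / 0.0 = NaN here; Python needs it spelled out.
        return [float("nan")] * len(scores)
    return [s / total for s in scores]


def normalize_log_space(word_probs: List[List[float]]) -> List[float]:
    """Same quantity, computed with log-sums and a max shift for stability."""
    log_scores = [sum(math.log(p) for p in probs) for probs in word_probs]
    shift = max(log_scores)
    weights = [math.exp(s - shift) for s in log_scores]  # largest weight is 1.0
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 80-word document with two topics and tiny per-word probabilities.
long_doc = [[1e-5] * 80, [2e-5] * 80]
naive_result = normalize_naive(long_doc)
stable_result = normalize_log_space(long_doc)
```

With per-word probabilities around 1e-4 to 1e-5, a product of roughly 65 to 80 factors falls below the smallest representable double, which would be in line with a cutoff near 76 words, but this remains a guess until the sampling code is inspected.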