Computing c_v coherence for a 50-topic LDA model trained on 40k long documents (around 20M total tokens) runs for about 15 minutes and then crashes the kernel. The same computation with gensim (via the great snippet provided in another issue) works fine and takes about 2.5 minutes.
I'm running the following code, adapted from the examples repo, on tomotopy 0.12.3 / Python 3.10.8:
Any thoughts?