Coherence crashing for 50 topics LDA model / 40k+ long documents (~20M total tokens)

Computing c_v coherence for a 50-topic LDA model trained on 40k long documents (around 20M tokens in total) runs for about 15 minutes and then crashes the kernel. Computing the same score with gensim (via the great snippet provided in another issue) works just fine and takes about 2.5 minutes.

I'm running the following code, adapted from the examples repo, on tomotopy 0.12.3 / Python 3.10.8:

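from tomotopy.coherence import Coherence

# lda_model_50k is the already-trained 50-topic tomotopy LDA model
coh_model = Coherence(lda_model_50k, coherence='c_v')
average_coherence = coh_model.get_score()  # average c_v over all topics
print(average_coherence)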
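
For comparison, the gensim route that does finish looks roughly like the sketch below. It is not the exact snippet from the other issue: the helper names `topic_top_words` and `gensim_cv` are made up for illustration, and it assumes `docs` holds the same tokenized documents (a list of token lists) that the model was trained on.

```python
def topic_top_words(model, top_n=10):
    """Return the top_n words of every topic as a list of word lists."""
    return [
        [word for word, _ in model.get_topic_words(k, top_n=top_n)]
        for k in range(model.k)
    ]


def gensim_cv(model, docs, top_n=10):
    """Score a tomotopy model's topics with gensim's c_v coherence.

    docs: list of token lists (assumed to be the training documents).
    """
    # Imported lazily so topic_top_words works without gensim installed.
    from gensim.corpora import Dictionary
    from gensim.models.coherencemodel import CoherenceModel

    dictionary = Dictionary(docs)
    cm = CoherenceModel(
        topics=topic_top_words(model, top_n),
        texts=docs,
        dictionary=dictionary,
        coherence='c_v',
        topn=top_n,
    )
    return cm.get_coherence()
```

With this, `gensim_cv(lda_model_50k, docs)` would be the counterpart of `coh_model.get_score()` above, so the two timings compare the same top-10 words per topic.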

Any thoughts?

bab2min / tomotopy

Coherence crashing for 50 topics LDA model / 40k+ long documents (~20M total tokens) #191