The strength of gensim is in processing large data, using lazy-loading streams.
I noticed your code puts all documents into RAM (as a plain list), so I changed it to use lazy iteration.
It also seems you're not doing any preprocessing, and LDA can be picky about that, so I added some rudimentary preprocessing: lowercasing words and removing stopwords.
I tried to follow your non-PEP8 coding style, for visual consistency.
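
To show the two changes in isolation, here is a minimal sketch in plain Python. It is not the modified code itself: the names `STOPWORDS`, `preprocess`, `LazyCorpus` and `demo` are made up for illustration, the stopword list is deliberately tiny, and it assumes one document per line in a text file.

```python
import os
import tempfile

# Tiny illustrative stopword list (an assumption; use a fuller list in practice).
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "is", "in", "to"})


def preprocess(line):
    """Lowercase a line, split it on whitespace, and drop stopwords."""
    return [word for word in line.lower().split() if word not in STOPWORDS]


class LazyCorpus:
    """Stream documents from a file, one per line, instead of building a list."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Reopening the file on every pass lets the corpus be iterated repeatedly
        # while only one line is held in memory at a time.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                yield preprocess(line)


def demo():
    """Write two sample documents to a temp file and stream them back."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("The strength of Gensim is in streaming\n")
        tmp.write("LDA is picky about preprocessing\n")
        path = tmp.name
    try:
        return list(LazyCorpus(path))
    finally:
        os.remove(path)
```

Because `LazyCorpus` defines `__iter__` rather than returning a one-shot generator, it can be walked several times, which matters when a dictionary is built in one pass and the model is trained in another. An iterable of token lists like this can be passed to `gensim.corpora.Dictionary`, and each document can then be converted with the dictionary's `doc2bow` method.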