james-bowman / nlp

Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang
MIT License
445 stars, 45 forks

Online/streaming LDA? #8

Open carbocation opened 6 years ago

carbocation commented 6 years ago

Is it possible to run LDA (or other processing algorithms) in a streaming/online fashion, such as is done with gensim? It seems that the library would not easily support online processing, but I thought I'd bounce the question off of you since you know the internals much better.

james-bowman commented 6 years ago

Great question. All the algorithms will work in an online setting, but the majority require batch training in advance. Some, like the LDA and RI algorithms, could be made to work with online training with a small amount of effort. The HashingVectoriser doesn't require training, so it is particularly suited to streaming data. I will take a look and see if I can add online training and persistence support. In the meantime, Pull Requests are welcome :-)
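
To illustrate why a hashing vectoriser needs no training pass, here is a minimal standard-library sketch of the hashing trick applied to a stream of documents. It is not the library's HashingVectoriser implementation: the `hashVector` function, the FNV-1a hash, the whitespace tokenisation and the feature count of 8 are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// hashVector maps a document to a fixed-length term-count vector using the
// hashing trick. Each token's column is derived from its hash, so no
// vocabulary has to be learned from a corpus beforehand.
// Hypothetical helper for illustration; numFeatures must be > 0.
func hashVector(doc string, numFeatures int) []float64 {
	vec := make([]float64, numFeatures)
	for _, tok := range strings.Fields(strings.ToLower(doc)) {
		h := fnv.New32a()
		h.Write([]byte(tok)) // Write on a hash.Hash never returns an error
		vec[h.Sum32()%uint32(numFeatures)]++
	}
	return vec
}

func main() {
	// Simulate a stream: documents arrive one at a time on a channel.
	docs := make(chan string)
	go func() {
		defer close(docs)
		for _, d := range []string{
			"the quick brown fox",
			"the lazy dog",
			"a fox jumped over the dog",
		} {
			docs <- d
		}
	}()

	// Each document is vectorised as it arrives, with no prior fitting step.
	for d := range docs {
		fmt.Println(hashVector(d, 8))
	}
}
```

Because the column index depends only on the token and the vector length, documents seen for the first time in the stream get the same treatment as any others. The trade-off is that distinct tokens can collide in the same column, which a larger feature count makes less likely.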