dongwookim-ml / python-topic-model

Implementation of various topic models
Apache License 2.0
369 stars, 172 forks

[Question] write a vectorized form of do_e_step method in lda_vb. #10

Closed (mahatosourav91 closed this issue 7 years ago)

mahatosourav91 commented 7 years ago

Hi, I am trying to implement LDA in TensorFlow. I am quite new to both TensorFlow and LDA. Currently I am following your lda_vb implementation. Is it possible to have a vectorized implementation (without for loops) of the do_e_step method?
If yes, it would be very helpful if you could provide some insights on how to implement it.

dongwookim-ml commented 7 years ago

Hi, I guess you can parallelise the first loop (for d in range(0, self.n_doc)), but I am not sure about the second loop (for iter in xrange(self.gamma_iter)). You could set the number of iterations for the second loop to 1, but this may slow down convergence.
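To make the idea concrete, here is a minimal NumPy sketch of an E-step that is vectorized over documents. It is not the code from lda_vb: it assumes a dense document-term count matrix and the standard variational update for LDA, where each document's gamma depends only on its own counts and the current topics. The names vectorized_e_step, digamma_approx, counts and exp_elog_beta are hypothetical, and digamma_approx is a small stand-in that could be replaced by scipy.special.digamma. The inner fixed-point loop is still there; only the loop over documents is replaced by matrix operations.

```python
import numpy as np


def digamma_approx(x):
    """Approximate digamma for positive inputs (stand-in for scipy.special.digamma)."""
    x = np.array(x, dtype=float)  # copy, so the caller's array is not modified
    result = np.zeros_like(x)
    # Recurrence psi(x) = psi(x + 1) - 1/x, applied 6 times so that x >= 6.
    for _ in range(6):
        result -= 1.0 / x
        x += 1.0
    # Asymptotic expansion, accurate for large x.
    inv2 = 1.0 / (x * x)
    result += np.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def vectorized_e_step(counts, exp_elog_beta, alpha=0.1, n_iter=50, tol=1e-3, seed=0):
    """Update gamma for all documents at once.

    counts:        (n_doc, n_voca) dense word-count matrix (assumed layout)
    exp_elog_beta: (n_topic, n_voca) positive matrix, exp(E[log beta])
    returns gamma: (n_doc, n_topic)
    """
    rng = np.random.default_rng(seed)
    n_doc, n_topic = counts.shape[0], exp_elog_beta.shape[0]
    gamma = rng.gamma(100.0, 0.01, size=(n_doc, n_topic))

    for _ in range(n_iter):  # the second loop stays sequential
        # exp(E[log theta]) for every document, shape (n_doc, n_topic)
        exp_elog_theta = np.exp(
            digamma_approx(gamma) - digamma_approx(gamma.sum(axis=1, keepdims=True))
        )
        # Normalizer of phi for every (document, word) pair, shape (n_doc, n_voca)
        phinorm = exp_elog_theta @ exp_elog_beta + 1e-100
        new_gamma = alpha + exp_elog_theta * ((counts / phinorm) @ exp_elog_beta.T)
        change = np.abs(new_gamma - gamma).mean()
        gamma = new_gamma
        if change < tol:  # one shared stopping rule for the whole batch
            break
    return gamma


# Toy usage with random data (illustrative only).
rng = np.random.default_rng(42)
counts = rng.integers(0, 5, size=(4, 6)).astype(float)
topics = rng.dirichlet(np.ones(6), size=3)  # used here in place of exp(E[log beta])
gamma = vectorized_e_step(counts, topics)
print(gamma.shape)
```

One difference from a per-document loop is that all documents share a single convergence check here, so some documents may run more iterations than they would need individually. The same matrix expressions should carry over to TensorFlow tensors, though I have not tested that.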

mahatosourav91 commented 7 years ago

It would be helpful if you could guide us on how to parallelize the outer loop. I am completely new to this.

dongwookim-ml commented 7 years ago

The following repository contains a parallel (distributed) version of variational inference for LDA:

https://github.com/tbroderick/streaming_vb

It seems that parallelfiltering.py parallelises the first loop. Check the update_lambda function inside the ParallelFiltering class. They also provide asynchronous posterior updates. To understand the details, please see their original paper, 'Streaming Variational Bayes'.