fani-lab / SEERa

A framework to predict the future user communities in a text streaming social network based on the users’ topics of interest.

Neural topic modeling methods #59

Open soroush-ziaeinejad opened 1 year ago

soroush-ziaeinejad commented 1 year ago

Hi @Lillliant,

This issue page is for your second task on SEERa. As you know, SEERa has a layered structure, and its second layer is tml (the topic modeling layer). For now, SEERa has three topic modeling methods: lda, gsdmm, and btm. Since none of them is a neural model (which would probably work better than our current models), we decided to add at least one neural topic modeling method to SEERa. As a first step, please research the current neural topic modeling methods and report your findings. Then we can discuss them and decide which ones should be added to SEERa. Here are two links to take a look at as well: https://github.com/MaartenGr/BERTopic and https://github.com/MilaNLProc/contextualized-topic-models

Please let us know if you have any questions regarding this task.

@hosseinfani, please feel free to add your comments on this task.

Lillliant commented 1 year ago

Based on my search, neural topic models are usually based on these methods:

Many of the models I've found also include performance benchmarks (e.g., this paper examines performance for BERTopic and CTM). It seems that CTM is better in topic diversity, but BERTopic is better in topic coherence. However, I'm not sure I understand the benchmarks well enough to make an accurate judgement.
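
For readers unfamiliar with the topic diversity metric mentioned above, here is a minimal sketch of one common definition: the fraction of unique words among the top-k words of all topics. The function name `topic_diversity`, the `top_k` parameter, and the example word lists are illustrative assumptions, not code from SEERa, BERTopic, or CTM, and the benchmark referenced above may compute the metric differently.

```python
def topic_diversity(topics, top_k=10):
    """Fraction of unique words among the top_k words of all topics.

    `topics` is a list of topics, each given as a list of words ordered
    from most to least representative. A value near 1.0 means topics
    share few top words; a value near 0.0 means they are redundant.
    """
    # Collect the top_k words of every topic into one flat list.
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        return 0.0
    return len(set(top_words)) / len(top_words)


# Hypothetical topics for illustration only: "team" appears in both.
example_topics = [
    ["game", "team", "score", "player"],
    ["vote", "election", "party", "team"],
]
diversity = topic_diversity(example_topics, top_k=4)
print(diversity)  # 7 unique words out of 8 -> 0.875
```

Topic coherence, by contrast, is usually estimated from word co-occurrence statistics in a reference corpus, so it cannot be computed from the topic word lists alone.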

I tried to understand the different terms by reading this paper. Please let me know if something's inaccurate or unclear!

hosseinfani commented 1 year ago

@Lillliant Thank you.

@soroush-ziaeinejad @Hamedloghmani @farinamhz @ZahraTaherikhonakdar @yogeswarl @karan96 This is an example of a short but nice literature review, done in 7 days by an undergrad student who is close to final exams with 3-4 courses! Do you have such a thing for your research domain after months/years of work?!