Open ImaneChafi opened 4 years ago
[x] METM - Works: https://github.com/li-lab-mcgill/covid19_media/commit/a4f6073fc9ae763bbe5975a337123ddbe6bdc610
[x] DM-ETM - Pending
[x] DETM - Works, but NaN values appear
[x] ETM - Works, used this repo for reference: https://github.com/adjidieng/ETM
[ ] S-DETM - Pending
Let's run thorough experiments comparing DM-ETM and D-ETM under various settings on the 4 datasets (Aylien, GPHIN, GPHIN online parse, WHO):
Try these topic numbers: {10, 20, 30, 40, 50}. Run each model for 100 epochs, let the model do its annealing, and choose the best model based on validation perplexity (val ppl).
Finally, compare these models in terms of test perplexity. We will report the results in a table (with the best-performing model in each category in bold).
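The sweep and selection rule above could be organized roughly as follows. This is only a sketch, not code from the repo: `experiment_grid`, `select_best`, `winners`, and the run-record keys (`val_ppl`, `test_ppl`) are hypothetical names, and it assumes "category" means dataset. Runs whose validation perplexity is NaN are skipped, since NaN compares false against everything and would otherwise block selection.

```python
import math
from itertools import product

DATASETS = ["Aylien", "GPHIN", "GPHIN online parse", "WHO"]
MODELS = ["DM-ETM", "D-ETM"]
TOPIC_NUMBERS = [10, 20, 30, 40, 50]
EPOCHS = 100


def experiment_grid():
    """Yield one config per (dataset, model, num_topics) combination."""
    for dataset, model, k in product(DATASETS, MODELS, TOPIC_NUMBERS):
        yield {"dataset": dataset, "model": model, "num_topics": k, "epochs": EPOCHS}


def select_best(runs):
    """For each (dataset, model), keep the run with the lowest validation perplexity.

    Each run is a dict with hypothetical keys: dataset, model, num_topics,
    val_ppl, test_ppl. Runs with a NaN val_ppl are ignored.
    """
    best = {}
    for run in runs:
        if math.isnan(run["val_ppl"]):
            continue
        key = (run["dataset"], run["model"])
        if key not in best or run["val_ppl"] < best[key]["val_ppl"]:
            best[key] = run
    return best


def winners(best):
    """For each dataset, name the model whose selected run has the lowest test perplexity."""
    out = {}
    for (dataset, model), run in best.items():
        current = out.get(dataset)
        if current is None or run["test_ppl"] < best[(dataset, current)]["test_ppl"]:
            out[dataset] = model
    return out
```

The grid has 4 x 2 x 5 = 40 configurations. Selection uses validation perplexity only; test perplexity is read off afterwards for the table, and `winners` gives the entry to put in bold per dataset.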
We will then pick the best model for downstream topic analysis: overall topic popularity. To help annotate the topics, we will average and re-normalize the dynamic topics across time, so that we have a fixed set of topics to visualize and manually annotate (same as the Fig. 2 I have).
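A minimal sketch of the averaging step, assuming the dynamic topic-word distributions are available as a nested list `beta[t][k][v]` (time step, topic, vocabulary word). That layout and the names `average_topics` and `top_words` are assumptions for illustration, not the repo's actual tensors or functions.

```python
def average_topics(beta):
    """Collapse dynamic topics into one static distribution per topic.

    beta[t][k][v] is the probability of word v under topic k at time t
    (assumed layout). Returns averaged[k][v], with each topic re-normalized
    to sum to 1 over the vocabulary.
    """
    num_times = len(beta)
    num_topics = len(beta[0])
    vocab_size = len(beta[0][0])
    averaged = []
    for k in range(num_topics):
        # Mean probability of each word across all time steps.
        mean = [
            sum(beta[t][k][v] for t in range(num_times)) / num_times
            for v in range(vocab_size)
        ]
        total = sum(mean)
        averaged.append([m / total for m in mean])
    return averaged


def top_words(topic, vocab, n=10):
    """Return the n highest-probability words of one averaged topic."""
    order = sorted(range(len(topic)), key=lambda v: topic[v], reverse=True)
    return [vocab[v] for v in order[:n]]
```

If every time slice is already a proper distribution, the mean already sums to 1 and the re-normalization only corrects rounding error. The `top_words` lists are what would be shown for manual annotation.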
For the new GPHIN data created by the GPHIN scraper, experiment on topic mixture ETM, METM, D-ETM and DM-ETM