li-lab-mcgill / covid19_media


Experiment Intervention predictions using topic mixture on GPHIN data #3

Open ImaneChafi opened 4 years ago

ImaneChafi commented 4 years ago

For the new GPHIN data created by the GPHIN scraper, run topic-mixture experiments with ETM, METM, D-ETM, and DM-ETM.

ImaneChafi commented 4 years ago

Let’s do thorough experiments to compare DM-ETM and D-ETM with various settings on the 4 datasets (Aylien, GPHIN, GPHIN online parse, WHO):

Try these topic numbers: {10, 20, 30, 40, 50}. Run each model for 100 epochs with annealing enabled, and choose the best model based on validation perplexity (val ppl).

Finally, compare these models in terms of test perplexity. We will report the results in a table (bold-face the best performing model in each category).
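The selection rule above (pick by validation perplexity, report test perplexity) could be sketched as follows. This is only an illustration: the record layout, the function names, the model names `model_a`/`model_b`, and all numbers are hypothetical placeholders, not results or code from this repo.

```python
# Hypothetical sketch of the model-selection step. All names and numbers
# below are placeholders for illustration, not real results.

def select_best(runs):
    """For each (dataset, model), keep the run with the lowest validation perplexity."""
    best = {}
    for run in runs:
        key = (run["dataset"], run["model"])
        if key not in best or run["val_ppl"] < best[key]["val_ppl"]:
            best[key] = run
    return best


def winners_by_dataset(best):
    """For each dataset, return the selected run with the lowest test perplexity."""
    winners = {}
    for (dataset, _model), run in best.items():
        if dataset not in winners or run["test_ppl"] < winners[dataset]["test_ppl"]:
            winners[dataset] = run
    return winners


# Toy input: one record per (dataset, model, number of topics).
runs = [
    {"dataset": "toy", "model": "model_a", "num_topics": 10, "val_ppl": 120.0, "test_ppl": 125.0},
    {"dataset": "toy", "model": "model_a", "num_topics": 20, "val_ppl": 110.0, "test_ppl": 118.0},
    {"dataset": "toy", "model": "model_b", "num_topics": 10, "val_ppl": 105.0, "test_ppl": 112.0},
    {"dataset": "toy", "model": "model_b", "num_topics": 20, "val_ppl": 108.0, "test_ppl": 109.0},
]

best = select_best(runs)            # one selected run per (dataset, model)
winners = winners_by_dataset(best)  # the entry to bold-face per dataset
```

Note that `model_b` with 20 topics has the lowest test perplexity in the toy data but is not selected, because the choice of setting is made on validation perplexity only; test perplexity is used just for the final comparison.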

We will then pick the best model for downstream topic analysis, starting with overall topic popularity. To help annotate the topics, we will average and re-normalize the dynamic topics across time, so that we have a fixed set of topics to visualize and manually annotate (same as the Fig. 2 I have).
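A minimal sketch of the averaging step, assuming the dynamic topic-word distributions are available as an array `beta` of shape `(K, T, V)` (topics, time steps, vocabulary). That layout and the function name are assumptions for illustration and may not match how the model code stores them.

```python
import numpy as np


def average_topics_over_time(beta):
    """Collapse dynamic topics of shape (K, T, V) into fixed topics of shape (K, V).

    Each beta[k, t] is assumed to be a probability distribution over the vocabulary.
    """
    beta = np.asarray(beta, dtype=float)
    mean_beta = beta.mean(axis=1)  # average over the time axis -> (K, V)
    # Re-normalize so each topic sums to 1 over the vocabulary.
    return mean_beta / mean_beta.sum(axis=1, keepdims=True)


def top_words(fixed_beta, vocab, n=10):
    """Return the n highest-probability words per topic, for manual annotation."""
    order = np.argsort(-fixed_beta, axis=1)[:, :n]
    return [[vocab[i] for i in row] for row in order]
```

If every `beta[k, t]` already sums to 1, the average does too, so the re-normalization only guards against numerical drift or unnormalized inputs.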

  1. Country-specific topic popularity based on the DM-ETM results
  2. Correlating dynamic topics with confirmed cases and deaths over time
  3. (If space allows) Predict interventions with a separate classifier, comparing the best D-ETM and DM-ETM models chosen by validation perplexity as described above. We will explore semi-supervised DM-ETM and D-ETM in the next paper.
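For item 2, one possible way to correlate topics with case counts is a per-topic Pearson correlation over time. The sketch below assumes `topic_popularity` has shape `(T, K)` (mean topic proportion per time step) and `counts` has shape `(T,)`. Both names, the shapes, and the choice of Pearson correlation are assumptions, not something fixed in this thread.

```python
import numpy as np


def topic_case_correlation(topic_popularity, counts):
    """Pearson correlation between each topic's popularity and a count series.

    topic_popularity: array of shape (T, K); counts: array of shape (T,).
    Returns an array of K correlation coefficients.
    """
    x = np.asarray(topic_popularity, dtype=float)
    y = np.asarray(counts, dtype=float)
    x_centered = x - x.mean(axis=0)
    y_centered = y - y.mean()
    # A topic (or count series) that is constant over time gives a zero
    # denominator here, and the result for that topic is NaN.
    denom = np.sqrt((x_centered ** 2).sum(axis=0) * (y_centered ** 2).sum())
    return (x_centered * y_centered[:, None]).sum(axis=0) / denom
```

The same function can be called once with confirmed cases and once with deaths, and per country by passing that country's topic proportions and counts.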