BabakHemmatian / Gay_Marriage_Corpus_Study

LDA and RNN for Reddit comments

Topic similarity across various iterations #16

Closed BabakHemmatian closed 5 years ago

BabakHemmatian commented 6 years ago

Even though we're planning to use topic coherence to choose num_topics, it might be nice to report the Jensen-Shannon divergence (JSD) between the top topics at the optimized num_topics. There might be interesting patterns in the topics. What do you think?

sabjoslo commented 6 years ago

You mean JSD from each top topic in a given model to each of the other top topics in the same model?

BabakHemmatian commented 6 years ago

Yeah. I feel like using coherence makes the comparison between different models unnecessary at the moment. But it might be interesting to see what the distance is between the top topics in the model with the num_topics we end up settling on. Since JSD is symmetric, we only need to compute it once per pair of topics rather than in both directions.
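
For concreteness, here is a minimal sketch of the pairwise comparison, assuming each top topic is available as a probability distribution over the same vocabulary. The names `jsd`, `pairwise_jsd`, and `toy_topics` are hypothetical and not taken from the repo, and the toy distributions are made up for illustration. It uses base-2 logs, so the divergence falls in [0, 1].

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """KL(p || q) in bits; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two distributions (base 2)."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def pairwise_jsd(topics):
    """Return {(i, j): JSD} for i < j only; symmetry makes (j, i) redundant."""
    return {
        (i, j): jsd(topics[i], topics[j])
        for i, j in combinations(range(len(topics)), 2)
    }


# Hypothetical topic-word distributions over a 4-word vocabulary.
toy_topics = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.25, 0.25, 0.25, 0.25],
]
distances = pairwise_jsd(toy_topics)
for (i, j), d in sorted(distances.items()):
    print(f"topic {i} vs topic {j}: JSD = {d:.4f}")
```

With k top topics this computes k(k-1)/2 values instead of k^2.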

sabjoslo commented 6 years ago

@BabakHemmatian, this might be a good sanity check on how the human ratings partition consequentialist/values-based topics. If the JSD between topics in the same category is systematically lower than the JSD between topics in different categories, that would support our claim that the raters are picking up on a meaningful distinction.
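
A rough sketch of that check, assuming each topic is a probability distribution over a shared vocabulary and carries one rater-assigned label. The function name `within_between_jsd`, the label strings, and the toy distributions are all hypothetical. It only compares mean divergences; showing that the gap is systematic would still need something like a permutation test over the labels.

```python
import math
from itertools import combinations
from statistics import mean


def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms where x_i == 0 contribute nothing to KL(x || y).
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def within_between_jsd(topics, labels):
    """Mean JSD over same-label pairs and over different-label pairs."""
    within, between = [], []
    for i, j in combinations(range(len(topics)), 2):
        d = jsd(topics[i], topics[j])
        (within if labels[i] == labels[j] else between).append(d)
    # statistics.mean raises StatisticsError if either group has no pairs.
    return mean(within), mean(between)


# Hypothetical topics over a 4-word vocabulary, with made-up rater labels.
toy_topics = [
    [0.60, 0.30, 0.05, 0.05],
    [0.50, 0.40, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.05, 0.05, 0.40, 0.50],
]
toy_labels = ["consequentialist", "consequentialist", "values", "values"]

within_mean, between_mean = within_between_jsd(toy_topics, toy_labels)
print(f"mean within-category JSD:  {within_mean:.4f}")
print(f"mean between-category JSD: {between_mean:.4f}")
```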