Open azhe825 opened 7 years ago
can u generate some way of building data sets at increasing distance? see how your conclusions fail as you increase distance?
can u use LDA as a faster way to find relevant topics?
Will try generating synthetic data. Preparing for midterm this week.
What do you mean by "use LDA as a faster way to find relevant topics"? Apply LDA+SVM on FASTREAD? I have a preliminary result showing that LDA+SVM with 100 topics performs a bit better than FASTREAD in one run. So it might be promising, as the target is clearly one specific topic.
roger. focus on that
Supported by https://github.com/ai-se/ML-assisted-SLR/blob/master/no_ES/src/runner.py
Data Similarity
LDA on 30 topics (the number of topics does not matter much).
Topic weighting for the two data sets:
L1 similarity, the default for LDA:
L2 similarity, which makes more sense:
Target Similarity
LDA on 30 topics.
Topic weighting for the two relevant sets:
L1 similarity, the default for LDA:
L2 similarity, which makes more sense:
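For readers following along, here is a minimal sketch of one possible reading of the two measures above. It assumes "L1 similarity" and "L2 similarity" mean the dot product of two topic-weight vectors after scaling each to unit L1 or unit L2 norm. That interpretation, the helper names `normalize` and `similarity`, and the 3-topic example vector are all assumptions for illustration; they are not taken from runner.py and are not real results.

```python
import math


def normalize(weights, order):
    """Scale a topic-weight vector to unit L1 (order=1) or unit L2 (order=2) norm."""
    if order == 1:
        norm = sum(abs(w) for w in weights)
    elif order == 2:
        norm = math.sqrt(sum(w * w for w in weights))
    else:
        raise ValueError("order must be 1 or 2")
    if norm == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [w / norm for w in weights]


def similarity(a, b, order):
    """Dot product of two topic-weight vectors after L1 or L2 normalization."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of topics")
    na, nb = normalize(a, order), normalize(b, order)
    return sum(x * y for x, y in zip(na, nb))


# Hypothetical topic weighting (3 topics for brevity, not real data).
weights = [6.0, 3.0, 1.0]

# Comparing a vector with itself: the L2 version gives 1 (up to rounding),
# while the L1 version gives the sum of squared proportions, which is below 1.
l1_self = similarity(weights, weights, order=1)
l2_self = similarity(weights, weights, order=2)
```

Under this reading, the L2 version is the cosine similarity, so identical topic weightings score 1 regardless of how spread out the weights are. The L1 version scores identical weightings lower the more evenly the weight is spread across topics, which may be one reason the L2 numbers are easier to interpret.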
Conclusion:
Problem: