Open vdpappu opened 4 years ago
The goal of topic change detection is to slice the meeting into multiple partitions, where each partition carries enough information to stand on its own as a discussion.
The following needs to be addressed to achieve this:
Building on our current implementation, I made a few extra changes to try to fix the last issue.
Doing this increased the accuracy of the communities by a large amount, and overlapping groups are no longer formed.
To improve the current communities approach (the one on staging), or more precisely to understand what works best for communities, I went through a few papers and methods to see how effective the approach can be. Based on that, I made a few changes to the algorithm.
Instead of a fully connected network, we connect two sentences only if they are from the same segment or from the next segment. This helps reduce cosine similarity noise.
Normalizing the graph is now a bit different. We compute a local normalization score for each node, and for edges shared by two nodes we average the two scores.
The community approach relies on self-loops, so those are also added.
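The three graph changes above can be sketched roughly as follows. This is only an illustration, not the code on staging: `build_graph`, `segment_ids`, and `self_loop_weight` are hypothetical names, dropping non-positive similarities is an added assumption, and "local normalization" is assumed here to mean dividing each edge by the strongest edge at each endpoint and then averaging the two results.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def build_graph(embeddings, segment_ids, self_loop_weight=1.0):
    """Return {(i, j): weight} with i <= j for an undirected sentence graph."""
    n = len(embeddings)

    # 1. Sparse connectivity: only sentences from the same or the next segment.
    raw = {}
    for i in range(n):
        for j in range(i + 1, n):
            if abs(segment_ids[i] - segment_ids[j]) <= 1:
                w = cosine(embeddings[i], embeddings[j])
                if w > 0:  # assumption: ignore non-positive similarities
                    raw[(i, j)] = w

    # 2. Local normalization (assumed form): each node rescales its incident
    #    edges by its own strongest edge; an edge then takes the average of
    #    the two values proposed by its endpoints.
    local_max = [0.0] * n
    for (i, j), w in raw.items():
        local_max[i] = max(local_max[i], w)
        local_max[j] = max(local_max[j], w)
    edges = {
        (i, j): (w / local_max[i] + w / local_max[j]) / 2.0
        for (i, j), w in raw.items()
    }

    # 3. Self-loops on every node.
    for i in range(n):
        edges[(i, i)] = self_loop_weight
    return edges


# Toy example: four sentence vectors spread over segments 0, 0, 1, 2.
toy_edges = build_graph([[1, 0], [3, 4], [0, 1], [1, 0]], [0, 0, 1, 2])
```

In the toy example, sentences 0 and 3 have identical vectors but sit two segments apart, so no edge is created between them; that is the kind of long-range similarity noise the restriction is meant to remove.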
Based on this paper https://arxiv.org/pdf/0812.1770.pdf, we add a resolution parameter t, which helps control the stability of the network.
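To make the role of t concrete, here is a minimal sketch of a linearized stability score, one common way a time parameter acts as a resolution parameter in community detection. Whether our implementation uses this linearized form or something else is an assumption; `linearized_stability` and its arguments are illustrative names. At t = 1 the score reduces to standard modularity, and smaller t favors finer partitions.

```python
def linearized_stability(adj, communities, t=1.0):
    """Score a partition of an undirected weighted graph at time scale t.

    adj: symmetric adjacency matrix as a list of lists.
    communities: communities[i] is the community label of node i.
    Computes (1 - t) + sum over same-community pairs (i, j) of
    t * A_ij / 2m - k_i * k_j / (2m)^2, where 2m is the sum of all entries.
    """
    n = len(adj)
    strength = [sum(row) for row in adj]  # weighted degree k_i
    two_m = sum(strength)                 # total weight, counted in both directions
    score = 1.0 - t
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                score += t * adj[i][j] / two_m
                score -= strength[i] * strength[j] / (two_m ** 2)
    return score


# Toy graph: two disconnected pairs, 0-1 and 2-3.
A = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
pairs = [0, 0, 1, 1]
singletons = [0, 1, 2, 3]
```

On this toy graph the pair partition scores higher than singletons at t = 1.0, while at t = 0.1 the singleton partition wins, which is the sense in which t controls the granularity of the communities.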
On the validation set, the accuracy increased from 47 percent to 79 percent.
Let's discuss the potential next steps for improving topic change detection and update the activities here.