Closed BenoitFayolle closed 2 years ago
I tried to speed up the logic which I originally implemented at https://github.com/bnosac/textplot/blob/d0c40fb84738c0588a4e08784d30eabea26dd52a/R/textplot_biterms.R#L232-L233 but the speedup was not a correct implementation. Later on, I make sure only biterms with terms highly emitted by each topic are shown, at https://github.com/bnosac/textplot/blob/d0c40fb84738c0588a4e08784d30eabea26dd52a/R/textplot_biterms.R#L246. This was done to keep the graph crisp. So this is clearly a bug, but it probably does not occur often unless you really have completely overlapping topics.
I think you are responding to my other issue, but this one is different. I can send a reprex tomorrow.
Never mind, I just saw your commit to fix this issue 👍
I pushed the package to CRAN just now.
https://github.com/bnosac/textplot/blob/d0c40fb84738c0588a4e08784d30eabea26dd52a/R/textplot_biterms.R#L230-L231 Correct me if I'm wrong, but these don't actually pick the best/most frequent topic. `topic_freq` gives the number of occurrences of each biterm in the whole corpus since `topic` is not included in the `by` argument of the first line. Hence the second line picks the maximum of a variable that is constant within each group.
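
To make the point concrete, here is a minimal sketch in plain Python rather than the package's R code. The rows, the variable names and the `best_topic` helper are all hypothetical and only illustrate the grouping problem described above: a count grouped by the biterm alone is the same for every topic, so taking its maximum cannot select a topic, whereas a count grouped by biterm and topic can.

```python
from collections import Counter

# Hypothetical biterm-to-topic assignments: (term1, term2, topic).
# This is made-up data, not output from the package.
rows = [
    ("cat", "dog", 1),
    ("cat", "dog", 2),
    ("cat", "dog", 2),
    ("cat", "dog", 2),
    ("sun", "sky", 1),
]

# Grouping by the biterm only: one count per biterm, identical for every topic.
freq_by_biterm = Counter((t1, t2) for t1, t2, _ in rows)

# Grouping by biterm AND topic: the count now differs between topics.
freq_by_biterm_topic = Counter(rows)


def best_topic(biterm):
    """Return the topic in which the given biterm occurs most often."""
    counts = {
        topic: n
        for (t1, t2, topic), n in freq_by_biterm_topic.items()
        if (t1, t2) == biterm
    }
    return max(counts, key=counts.get)


print(freq_by_biterm[("cat", "dog")])            # 4, whatever the topic
print(freq_by_biterm_topic[("cat", "dog", 2)])   # 3, specific to topic 2
print(best_topic(("cat", "dog")))                # 2
```

In this sketch, only the second count carries any information about which topic a biterm belongs to most often.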