Closed sweetmals closed 6 years ago
Hi @sweetmals. The issue tracker is not a place for general questions. If you think something is a bug, please provide an example and explain why you think so.
If you still think there is a bug / issue with the coherence measurement, please clearly formulate what is wrong and how you think it should behave. You can see the actual code in R/coherence.R.
Also, please take a look at basic text formatting on GitHub: https://help.github.com/articles/basic-writing-and-formatting-syntax/
Maybe the question could be asked on Stack Overflow; I will answer it there...
On 19 June 2018 at 10:39:28 CEST, Dmitriy Selivanov notifications@github.com wrote:
Closed #270.
Hi @manuelbickel. That's alright, I've been asking you a lot of questions via the other thread too. Thanks a lot for your help.
Hi @manuelbickel
It would be great if you could clarify some of the code lines of the coherence.R implementation.
`tcm = as.matrix(tcm[top_terms_tcm, top_terms_tcm])` By this point you already have a TCM filtered to the top terms that appear both in the input `x` and in the original TCM. Correct me if I am wrong.
I am not clear what the following lines do within each topic:
`topic_i_term_indices = match(x[, i], terms_tcm)`
`# remove NA indices - not all top terms for topic 'i' are necessarily included in tcm`
`topic_i_term_indices = topic_i_term_indices[!is.na(topic_i_term_indices)]`
Isn't this the same as what you already do by taking the intersection of `top_terms_unique` and `terms_tcm` and then reconstructing the TCM with the line `tcm = as.matrix(tcm[top_terms_tcm, top_terms_tcm])`?
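To make the question concrete, here is a minimal toy sketch of the two steps. The term lists are made up, and it assumes that `terms_tcm` holds the row/column names of the already filtered TCM at the point where `match()` is called; the actual code in R/coherence.R may differ. Under these assumptions the two steps do not look identical: the intersection filters the TCM once for the terms of all topics together, while `match()` looks up the positions of a single topic's terms inside that filtered TCM.

```r
# Hypothetical top terms, one column per topic
x = cbind(topic1 = c("cat", "dog", "fish"),
          topic2 = c("car", "bus", "dog"))

# Hypothetical term names of the original tcm ("fish" is not included)
terms_tcm_all = c("bus", "car", "cat", "dog", "tree")

# Global step: top terms of ALL topics that exist in the tcm
top_terms_unique = unique(as.vector(x))
top_terms_tcm = intersect(top_terms_unique, terms_tcm_all)
print(top_terms_tcm)  # "cat" "dog" "car" "bus"

# Assumption: names of the filtered tcm, i.e. tcm[top_terms_tcm, top_terms_tcm]
terms_tcm = top_terms_tcm

# Per-topic step: positions of the terms of ONE topic within the filtered tcm
idx1 = match(x[, 1], terms_tcm)  # 1 2 NA, because "fish" is not in the tcm
idx1 = idx1[!is.na(idx1)]        # 1 2
idx2 = match(x[, 2], terms_tcm)  # 3 4 2
```

In this toy case topic 1 still produces an `NA` after the global filtering, because "fish" is one of its top terms but was never in the TCM, so the `NA` removal is not redundant here.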
I am also finding it a bit hard to understand the computation of `log(smooth + tcm[x,y]) - log(tcm[y,y])`. Would you be able to explain why the matrix is transposed, then divided by its diagonal, and why the log is then taken of the lower triangle only? It seems to me that the following five lines basically evaluate the log expression above, but I do not understand how they do it: `d = diag(res)`, `res = t(res)`, `res = res / d`, `res = res[lower.tri(res)]`, `res = log(res)`.
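As a way to check my understanding, here is a toy sketch that runs the five lines on a small matrix and compares the result with the log expression written out pair by pair. The counts, the value of `smooth`, the upper-triangular layout of the matrix, and the way smoothing is applied to the off-diagonal entries are all assumptions made for illustration, not taken from coherence.R.

```r
# Hypothetical counts for three top terms.
# Diagonal: documents containing the term.
# Upper triangle: documents containing both terms.
terms = c("a", "b", "c")
tcm = matrix(c(10, 4, 2,
                0, 8, 3,
                0, 0, 5),
             nrow = 3, byrow = TRUE, dimnames = list(terms, terms))
smooth = 1  # assumed smoothing constant

res = tcm
# Assumption: smoothing is added to the co-occurrence (off-diagonal) counts
res[upper.tri(res)] = res[upper.tri(res)] + smooth

# The five lines from the question
d = diag(res)              # per-term counts: 10, 8, 5
res = t(res)               # moves the co-occurrence counts into the lower triangle
res = res / d              # d is recycled down the columns, so row i is divided by d[i]
res = res[lower.tri(res)]  # keeps each term pair exactly once
res = log(res)

# The same values written out pair by pair
explicit = c(log(smooth + tcm["a", "b"]) - log(tcm["b", "b"]),
             log(smooth + tcm["a", "c"]) - log(tcm["c", "c"]),
             log(smooth + tcm["b", "c"]) - log(tcm["c", "c"]))

print(all.equal(res, explicit))  # TRUE
```

On this toy matrix the transpose matters because the counts sit only in the upper triangle. After `t(res)`, entry `[i, j]` with `i > j` holds the smoothed co-occurrence of terms `j` and `i`, and dividing by `d` scales row `i` by the diagonal count of term `i`, so the log of the lower triangle equals `log(smooth + tcm[x,y]) - log(tcm[y,y])` for each pair.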
Please bear with me for my lack of knowledge.
Thanks a lot.