Open lounily opened 7 years ago
Firstly, the format of the word_similarity file is
s(0,0) s(0,1) ... s(0,|V|-1)
s(1,0) s(1,1) ... s(1,|V|-1)
... ... ... ...
s(|V|-1,0) s(|V|-1,1) ... s(|V|-1,|V|-1)
In detail, s(i,j) is the cosine similarity between the i-th word's embedding and the j-th word's embedding, |V| is the vocabulary size, and the items on each line are separated by a single space.
Secondly, the format of the qa_word2id file is
word,id
... ...
word,id
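As a rough illustration of the two formats described above, here is a minimal Python sketch that builds the lines of both files from a toy set of embeddings. The function names, the toy vectors, the choice to start ids at 0, and the six-decimal formatting are all assumptions made for the example; they are not taken from the GPUDMM code.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def format_word2id(vocab):
    """One 'word,id' line per word; ids follow the vocabulary order (assumed 0-based)."""
    return [f"{word},{idx}" for idx, word in enumerate(vocab)]


def format_similarity(vocab, embeddings):
    """One line per word i, holding s(i,j) for every word j, space-separated."""
    lines = []
    for wi in vocab:
        row = [cosine(embeddings[wi], embeddings[wj]) for wj in vocab]
        lines.append(" ".join(f"{s:.6f}" for s in row))
    return lines


# Hypothetical toy embeddings, only for illustration.
toy_embeddings = {"cat": [1.0, 0.0], "dog": [0.8, 0.6], "car": [0.0, 1.0]}
vocab = list(toy_embeddings)

word2id_lines = format_word2id(vocab)
similarity_lines = format_similarity(vocab, toy_embeddings)

print("\n".join(word2id_lines))
print("\n".join(similarity_lines))
```

Writing `"\n".join(word2id_lines)` and `"\n".join(similarity_lines)` to two text files would give inputs in the shape described above. Row i of the similarity file has to refer to the word whose id is i, so both files should be generated from the same vocabulary order.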
What do the result files _theta.txt and _assign.txt mean, and which files correspond to the result files of LDA? If I want to compute the perplexity of the model, which file should I use? Thank you very much. @NobodyWHU @duanyu
Hi, I would like to ask what the format of _snippet_200iter_initial_status.txt is. Thanks. @NobodyWHU
In fact, we did not use the initialFile in the experiments for our SIGIR paper; we initialize the topic assignments randomly. That was just experimental code that we forgot to delete, and we will fix the mistake :) I expect you will see the corrected version on GitHub tomorrow. @YaYaCT
Thank you very much. Looking forward to the corrected version. @NobodyWHU
@duanyu has fixed the mistake, thank you very much! @YaYaCT
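Hello authors: 1. A question about the word_similarity matrix: each row holds the similarities between a word and its similar words, so does the matrix need to include a word's similarity with itself? The paper says that a word's set of similar words includes the word itself. 2. Are the positions of j and i swapped on line 132 of the code? I am a beginner and do not fully understand it, so please excuse the question. I hope to get a reply from the authors. Thank you!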
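@YaYaCT word_similarity should be a symmetric matrix, so the values at [i,j] and [j,i] should be the same; writing it as [i,j] may be easier to understand. A word's similarity with itself should be 1.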
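The two properties mentioned in this reply (symmetry and a diagonal of 1) can be checked before a word_similarity file is used. The following Python sketch is only an illustration: the helper names and the tolerance are assumptions, and it operates on lines of text in the space-separated format rather than on any GPUDMM data structure.

```python
def parse_similarity(lines):
    """Parse space-separated rows of floats, skipping blank lines."""
    return [[float(x) for x in line.split()] for line in lines if line.strip()]


def check_similarity(matrix, tol=1e-6):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    n = len(matrix)
    problems = []
    for i, row in enumerate(matrix):
        if len(row) != n:
            problems.append(f"row {i} has {len(row)} entries, expected {n}")
        elif abs(row[i] - 1.0) > tol:
            problems.append(f"s({i},{i}) = {row[i]}, expected 1")
    if problems:
        # Skip the symmetry pass if the matrix is not square or the diagonal is off.
        return problems
    for i in range(n):
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                problems.append(f"s({i},{j}) != s({j},{i})")
    return problems


# Hypothetical rows, only for illustration.
good = parse_similarity(["1.0 0.8 0.0", "0.8 1.0 0.6", "0.0 0.6 1.0"])
bad = parse_similarity(["1.0 0.8", "0.3 1.0"])

print(check_similarity(good))
print(check_similarity(bad))
```

Because cosine similarity is symmetric, a file that fails the symmetry check was most likely written with mismatched word orders or truncated rows rather than computed incorrectly.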
What is the format of word_similarity.txt and qa_word2id.txt? Thank you very much. @NobodyWHU @duanyu