shanry closed this issue 6 years ago
Thank you for your interest in our work. Here are the answers to your questions.
I hope this helps. Please let me know if you have more questions, or if I can help you further.
First, thank you very much for your patient and detailed reply. (Usually I get an email when my questions are answered, but for some reason I didn't get one this time, so it was a surprise to see your comment today.)
Second, I have recently been using your code to try out a new model for link prediction, and now that I have some decent results, here are my new questions:
Again, thanks a lot, and I'm looking forward to your answer.
This is true: some training data will be wasted. Usually the knowledge graphs are large enough that the difference does not matter much. With a batch size of 128 on FB15k-237, you have 149689 1-N samples in the training set, which means you waste on average about 64 samples for similar batch sizes, or 0.00043 of the dataset. In this specific case of a batch size of 128, 85 samples are wasted, which is 0.00057 of the train set. On smaller datasets like UMLS you have 1560 1-N samples, and on average you waste around 0.041 of the data, which is high. You can, for example, use a batch size of 156 to reduce the waste to 0. I do not think it matters much for large datasets, since the fraction is so small, but for small datasets you definitely want to optimize around this quirk.
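In case it helps, here is a minimal sketch of that arithmetic, assuming the batching simply drops the last incomplete batch; `wasted_samples` is a hypothetical helper, and only the UMLS 1-N sample count (1560) is taken from the numbers above.

```python
# Minimal sketch of the batch-waste arithmetic, assuming the loader drops the
# last incomplete batch. The UMLS 1-N sample count (1560) is from the
# discussion above; other values are illustrative.

def wasted_samples(num_samples: int, batch_size: int) -> int:
    """Samples lost when the final incomplete batch is discarded."""
    return num_samples % batch_size

if __name__ == "__main__":
    umls_samples = 1560
    for batch_size in (128, 156):
        waste = wasted_samples(umls_samples, batch_size)
        print(f"UMLS, batch_size={batch_size}: {waste} wasted "
              f"({waste / umls_samples:.4f} of the 1-N samples)")
```

With a batch size of 156 the waste drops to 0, since 156 divides 1560 exactly.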
Of course you can use part of the code — I would be happy if you do so. I would even encourage it for the ranking code, because the ranking procedure is not straightforward and very error-prone. Many publications about link prediction in knowledge graphs cannot be replicated, and the crux of the issue might be that they wrote their own code and got the evaluation function wrong. Here, the ranking implementation has been tested not only by my co-authors, but also externally by other researchers, and I think the implementation is correct.
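For reference, below is a generic sketch of the filtered ranking protocol that link-prediction papers typically follow; it is not the code in this repository, and `score_tails` / `known_tails` are assumed placeholders. A common source of the evaluation bugs mentioned above is forgetting to mask the other known true tails before computing the rank.

```python
# Generic sketch of filtered ranking for link prediction (not this repo's code).
# score_tails(h, r) is a hypothetical function returning a score for every
# candidate tail entity; known_tails[(h, r)] contains all tails observed in
# train/valid/test for that (head, relation) pair.

import numpy as np

def filtered_rank(scores, true_tail, filter_tails):
    """Rank of the true tail after masking all other known true tails."""
    scores = scores.copy()
    for t in filter_tails:
        if t != true_tail:
            scores[t] = -np.inf  # remove competing true answers
    # Rank = 1 + number of candidates scored strictly higher than the target.
    return 1 + int((scores > scores[true_tail]).sum())

def evaluate(test_triples, score_tails, known_tails):
    ranks = []
    for h, r, t in test_triples:
        scores = score_tails(h, r)              # shape: (num_entities,)
        ranks.append(filtered_rank(scores, t, known_tails[(h, r)]))
    ranks = np.array(ranks, dtype=np.float64)
    return {
        "MR": ranks.mean(),
        "MRR": (1.0 / ranks).mean(),
        "Hits@1": (ranks <= 1).mean(),
        "Hits@10": (ranks <= 10).mean(),
    }
```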
Let me know if you have more questions.
OK, now some of my doubts are cleared. I am still working on my model, but it seems hard to tune it to surpass your ConvE by much on MRR and Hits@1 (although I can always get a much better MR). I plan to focus on it after August, and I hope we can have more discussions with each other then. Thank you.
Sure, you can always send me an email. If you think it could also be beneficial for others, you can always create a new issue here. Thank you.
Maybe these are a few too many questions, but I am working hard on a new model for this task, so I hope you can give some answers; it would be a great help for me and I would appreciate it a lot. Thank you! (Please forgive my poor English.)