Sujit-O / pykg2vec

Python library for knowledge graph embedding and representation learning.
MIT License

Hyperparameter settings for FB15K_237 #173

Open Diqingling opened 4 years ago

Diqingling commented 4 years ago

Are there hyperparameter settings for other datasets, such as FB15K_237, that reproduce the performance reported in the original papers? I tried using the FB15K hyperparameters to train on FB15K_237, but the results are far below the benchmarks. Tuning the parameters is very time-consuming, so I want to ask whether anyone can provide some settings. Thanks.

baxtree commented 4 years ago

Hi, @Diqingling. It looks like the original FB15K_237 paper has no published code, and it does not mention details such as the learning rates used for Model E and DistMult either. In cases like this we encourage users to explore hyperparameter tuning; you can find a how-to here. Meanwhile, we are extracting the golden hyperparameters for ConvKB, which also used FB15K_237, and of course nothing stops you from extracting them yourself and creating your own custom YAML file.
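
As a sketch of what the tuning entry point looks like (adapted from the `tune_model.py` example in the repo; exact module paths may differ across pykg2vec versions):

```python
import sys

from pykg2vec.config.hyperparams import KGETuneArgParser
from pykg2vec.utils.bayesian_optimizer import BaysOptimizer


def main():
    # Parse tuning options from the command line, e.g. `-mn TransE`.
    args = KGETuneArgParser().get_args(sys.argv[1:])

    # Run Bayesian optimization over the model's hyperparameter search space.
    bays_opt = BaysOptimizer(args=args)
    bays_opt.optimize()


if __name__ == "__main__":
    main()
```

Invoked as, e.g., `python tune_model.py -mn TransE`.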

Diqingling commented 4 years ago

Thank you. I'll try to tune the hyperparameters. Another question: since the hyperparameters in the YAML files don't fit FB15K_237, I used them to train models on FB15K as well. However, the filtered HIT-10 of TransE is around 60.2%, while the benchmark given in the RotatE paper is 74.9%. So I want to ask whether there is something wrong with my training process or with the hyperparameter settings.
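
For reference, here is roughly what my training run looks like, a sketch of the stock `train.py` entry point from the repo (exact module paths may differ across versions):

```python
import sys

from pykg2vec.config.config import Importer, KGEArgParser
from pykg2vec.utils.trainer import Trainer


def main():
    # Parse options from the command line, e.g. `-mn TransE -ds fb15k`.
    args = KGEArgParser().get_args(sys.argv[1:])

    # Look up the config and model classes for the chosen model name.
    config_def, model_def = Importer().import_model_config(args.model_name.lower())
    config = config_def(args=args)
    model = model_def(config)

    # Build, train and evaluate the model.
    trainer = Trainer(model=model, debug=args.debug)
    trainer.build_model()
    trainer.train_model()


if __name__ == "__main__":
    main()
```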

baxtree commented 4 years ago

Probably there was nothing wrong with your training process, because the best result we can get so far is 60.7%:

| Model | MR/FMR | MRR/FMRR | HIT-1 | HIT-3 | HIT-5 | HIT-10 |
| --- | --- | --- | --- | --- | --- | --- |
| TransE | 215.1193/77.7292 | 0.1921/0.3413 | 0.0852/0.2056 | 0.2176/0.4050 | 0.3003/0.4921 | 0.4206/0.6070 |

TransE is one of the simplest models, so if there were an implementation bug it should be easy to spot. Can you compare the hyperparameters set in TransE.yaml with those displayed in the "Global Setting" section of your training log?
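
If it helps, a minimal sketch for dumping the shipped hyperparameters so they can be checked against the log (the path to TransE.yaml is illustrative and depends on where your pykg2vec checkout keeps it):

```python
import yaml  # PyYAML

# Illustrative path: point this at the TransE.yaml in your pykg2vec checkout.
with open("pykg2vec/config/TransE.yaml") as f:
    golden = yaml.safe_load(f)

# Print each hyperparameter so it can be compared line by line
# against the "Global Setting" block of the training log.
for name, value in sorted(golden.items()):
    print(f"{name}: {value}")
```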

Diqingling commented 4 years ago

I checked the hyperparameters in the log file and they match the settings in TransE.yaml. The difference between 60.2% and 60.7% may just be normal run-to-run variance in training. However, both are far below the result given in the HolE paper, where the filtered HIT-10 is 74.9%. Does that mean the hyperparameters in TransE.yaml may not be the golden hyperparameters? What's more, how did you obtain the hyperparameters in the existing YAML files: using the hyperparameter tuning module in the project, or taking them directly from the original model papers?

baxtree commented 4 years ago

All hyperparameters are derived from the original papers and their published code, and there is a slim chance that some authors later corrected errors in their published work which have not been incorporated into pykg2vec. For the previous TensorFlow implementation, you can see HIT-10 is 66% from that table. Hi @louisccc, would you like to weigh in and share some history about this?

louisccc commented 4 years ago

Hi Diqingling, I think some minor adjustments to the hyperparameters could certainly boost TransE's performance further! However, when I implemented TransE, I followed the paper's implementation and used the hyperparameters mentioned in the paper. The current TransE performance aligns with their reported result.

In HolE, the authors wanted to compare their method using TransE as a baseline, and as mentioned in their paper they tried to align the training framework with their own as far as possible. Also, I checked the open-source HolE implementation, and it shows that the hyperparameters used are slightly different from those of TransE's original authors.

I am not sure whether it would be good to include both the original authors' settings and the HolE paper's settings in pykg2vec at the same time. Any ideas?