DeepGraphLearning / RNNLogic


Slight result difference #6

Open · wead-hsu opened this issue 2 years ago

wead-hsu commented 2 years ago

Hi, Meng. I was recently rerunning the code to reproduce the results.

Thanks for your wonderful code. I successfully reproduced the results on the Kinship and UMLS datasets. However, the results on WN18RR seem inconsistent (RNNLogic w/ emb: 7261, 0.45, 0.42, 0.53). I wonder if I did something wrong.

I did the following to run the experiment:

  1. install cppext
  2. CUDA_VISIBLE_DEVICES=0 python run.py ../data/wn18rr/ ./workspace/wn18rr False 0 1 RotatE_200
  3. compute the final result from train_log.txt by summarizing the logged results (selecting the best validation result)
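Step 3 can be scripted. Below is a minimal sketch of selecting the best validation result; the log-line format and the function name are assumptions for illustration, since the actual format of train_log.txt may differ (adjust the regex accordingly):

```python
# Hypothetical sketch: scan train_log.txt and keep the line with the best
# validation MRR. The "valid ... MRR: x" line format is an assumption.
import re

def best_valid_result(log_text):
    """Return (mrr, line) for the highest validation MRR found, or None."""
    best = None
    for line in log_text.splitlines():
        m = re.search(r"valid.*?MRR[:\s]+([0-9.]+)", line, re.IGNORECASE)
        if m:
            mrr = float(m.group(1))
            if best is None or mrr > best[0]:
                best = (mrr, line.strip())
    return best

# Example with made-up log lines:
sample = "epoch 100 valid MRR: 0.41\nepoch 200 valid MRR: 0.45\n"
print(best_valid_result(sample))
```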

I also found the same problem on the FB15k-237 dataset. Could you please help?

mnqu commented 2 years ago

Thanks Wead! I have forwarded the issue to Junkun, the co-first author who implemented rnnlogic w emb. He will figure it out.

Meng

wead-hsu commented 2 years ago

> Thanks Wead! I have forwarded the issue to Junkun, the co-first author who implemented rnnlogic w emb. He will figure it out.
>
> Meng

Thanks a lot. I am conducting some experiments based on this repo. If I get positive results, I'd like to share my idea and would look forward to your suggestions!

wead-hsu commented 2 years ago

> Thanks Wead! I have forwarded the issue to Junkun, the co-first author who implemented rnnlogic w emb. He will figure it out.
>
> Meng

After inspecting the log, this seems to be due to the default hyperparameters: predictor_num_epoch is so small that training is inadequate.

mnqu commented 2 years ago

> After inspecting the log, this seems to be due to the default hyperparameters: predictor_num_epoch is so small that training is inadequate.

This is indeed a problem. The code was refactored before release, especially the predictor, so the optimal hyperparameters might have changed. We are also tuning the hyperparameters to match the results of the previous code.

immortalCO commented 2 years ago

Hi Wead,

Thanks for your comments. In our paper, we use longer rules for the WN18RR dataset, so you should add the line 'max_rule_len': 5, between lines 55 and 71 in run.py to make the model use longer rules. Without this step, the model will only use rules of length at most 4, which might significantly reduce the results.

For FB15K-237, we noticed that there is no pretrained RotatE_500 in the data folder, so it might not be possible to train on that dataset with the current files. We will upload it soon.

Hope you can successfully reproduce the results. We are looking forward to your ideas! Feel free to tell us if you have any other issues.
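The suggested edit might look like the following, assuming run.py collects its hyperparameter overrides in a plain dict; the other key shown is a placeholder for illustration, not the actual contents of the file:

```python
# Illustrative sketch: insert 'max_rule_len': 5 into the hyperparameter
# dict in run.py. The rotate_pretrained entry is a placeholder.
hyperparams = {
    'rotate_pretrained': '../data/wn18rr/RotatE_200',  # placeholder for an existing entry
    'max_rule_len': 5,  # the added line: allow rules up to length 5 for WN18RR
}
```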

wead-hsu commented 2 years ago

> Hi Wead,
>
> Thanks for your comments. In our paper, we use longer rules for the WN18RR dataset, so you should add the line 'max_rule_len': 5, between lines 55 and 71 in run.py to make the model use longer rules. Without this step, the model will only use rules of length at most 4, which might significantly reduce the results.
>
> For FB15K-237, we noticed that there is no pretrained RotatE_500 in the data folder, so it might not be possible to train on that dataset with the current files. We will upload it soon.
>
> Hope you can successfully reproduce the results. We are looking forward to your ideas! Feel free to tell us if you have any other issues.

Thanks for your help!

When changing the max rule length, should I also change the length_time parameter in the rule_sample function to {1: 1, 2: 10, 3: 10, 4: 10, 5: 10}?
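For reference, the proposed setting, assuming length_time maps each rule length to a relative sampling count (an interpretation taken from the discussion, not verified against the code):

```python
# One sampling weight per rule length, now including length 5.
length_time = {1: 1, 2: 10, 3: 10, 4: 10, 5: 10}
```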

For FB15k-237, I actually pre-trained a model using the KnowledgeGraphEmbedding repo with the best configuration and moved it to the data directory. Is that OK?

I have modified the code and am running the experiments following your suggestions.

By the way, I noticed that relation_embedding is reset after training for validation (lines 719-721 in model_rnnlogic.py); does this matter for model performance?

Sincerely, Weidi Xu

wead-hsu commented 2 years ago

> > Hi Wead,
> >
> > Thanks for your comments. In our paper, we use longer rules for the WN18RR dataset, so you should add the line 'max_rule_len': 5, between lines 55 and 71 in run.py to make the model use longer rules. Without this step, the model will only use rules of length at most 4, which might significantly reduce the results.
> >
> > For FB15K-237, we noticed that there is no pretrained RotatE_500 in the data folder, so it might not be possible to train on that dataset with the current files. We will upload it soon.
> >
> > Hope you can successfully reproduce the results. We are looking forward to your ideas! Feel free to tell us if you have any other issues.
>
> Thanks for your help!
>
> When changing the max rule length, should I also change the length_time parameter in the rule_sample function to {1: 1, 2: 10, 3: 10, 4: 10, 5: 10}?
>
> For FB15k-237, I actually pre-trained a model using the KnowledgeGraphEmbedding repo with the best configuration and moved it to the data directory. Is that OK?
>
> I have modified the code and am running the experiments following your suggestions.
>
> By the way, I noticed that relation_embedding is reset after training for validation (lines 719-721 in model_rnnlogic.py); does this matter for model performance?
>
> Sincerely, Weidi Xu

Hi, I have tried to run the experiments with your suggestion; however, the results remain similar, i.e., MR ~7500+. For WN18RR, I tried to (1) increase max_rule_len to 5, (2) increase predictor_num_epoch to 8000, and (3) change the length_time in the rule_sample function to {1: 1, 2: 10, 3: 10, 4: 10, 5: 10}.

There are many details in the code, and I am not sure how to tune these parameters. If you have any other suggestions, please let me know.

immortalCO commented 2 years ago

Hi Weidi,

Sorry for the inconvenience. I have found a major problem in our code: the hyperparameters in run.py are designed for a fast test on small datasets, so they are very small and not suitable for large datasets. For example, it uses only 1000 rules, trains for only 1000 epochs, and disables pseudo-groundings; these settings turn off two major improvements of our model. When I saw that you "increased" predictor_num_epoch to 8000, whose default value is 200000, I realized that it had been mistakenly overridden in run.py.

I have updated run.py and removed all overridden hyperparameters except rotate_pretrained. Please use the new run.py and add 'max_rule_len': 5, between lines 17 and 21. With this setting, the model will use the default hyperparameters defined in model_rnnlogic.py (line 485), which are suitable for large datasets.
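The described behavior can be sketched as overrides merged over defaults. Everything here is illustrative except predictor_num_epoch = 200000 and 'max_rule_len': 5, which come from the thread; the dict names, path, and the other default value are assumptions:

```python
# Sketch: run.py keeps only a couple of overrides, and everything else
# falls back to the defaults in model_rnnlogic.py.
defaults = {
    'predictor_num_epoch': 200000,  # default mentioned in the thread
    'max_rule_len': 4,              # illustrative default
}
overrides = {
    'rotate_pretrained': '../data/wn18rr/RotatE_200',  # placeholder path
    'max_rule_len': 5,                                 # the line the user adds
}
config = {**defaults, **overrides}  # overrides win; defaults fill the rest
```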

Thank you again for pointing out the problems.

Sincerely, Junkun

wead-hsu commented 2 years ago

> Hi Weidi,
>
> Sorry for the inconvenience. I have found a major problem in our code: the hyperparameters in run.py are designed for a fast test on small datasets, so they are very small and not suitable for large datasets. For example, it uses only 1000 rules, trains for only 1000 epochs, and disables pseudo-groundings; these settings turn off two major improvements of our model. When I saw that you "increased" predictor_num_epoch to 8000, whose default value is 200000, I realized that it had been mistakenly overridden in run.py.
>
> I have updated run.py and removed all overridden hyperparameters except rotate_pretrained. Please use the new run.py and add 'max_rule_len': 5, between lines 17 and 21. With this setting, the model will use the default hyperparameters defined in model_rnnlogic.py (line 485), which are suitable for large datasets.
>
> Thank you again for pointing out the problems.
>
> Sincerely, Junkun

Thanks for your response. I didn't realize that pseudo-groundings (pgnd) were disabled by the previous configuration. I have updated the code and rerun the experiments.