DeepGraphLearning / RNNLogic


Some queries regarding code #20

Open ananjan-nandi-9 opened 1 year ago

ananjan-nandi-9 commented 1 year ago

Hi,

After going through this code and the RNNLogic paper, we have a few questions about the implementation uploaded here. In what follows, the "old code" refers to the code inside the codes folder in the repo, which contains an implementation of RNNLogic. The "new code" refers to the code inside the RNNLogic+ folder, which contains implementations of both RNNLogic and RNNLogic+.

1) The old code trains a separate model for each relation, so all the EM iterations for one relation finish before the next relation is trained. There is a common rule generator for all relations, but a separate predictor is trained for each relation serving as a rule head. The new code, on the other hand, trains on all relations together for both the RNNLogic and RNNLogic+ parts of the model: there is a common rule generator as well as a common predictor for all relations, and the training data for these models has relations interspersed. Could you please confirm whether these two approaches are equivalent? (A sketch of the two regimes as we understand them follows below.)
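To make sure we are reading the two code paths correctly, here is a minimal sketch of the two training regimes as we understand them. All names below (Generator, Predictor, train_step, etc.) are illustrative placeholders, not the actual classes or functions in the repository.

```python
# Minimal sketch of the two training regimes as we understand them.
# Generator / Predictor / train_step are illustrative stubs, not the repo's classes.

relations = ["brother_of", "aunt_of"]                      # toy relation set
data = {r: [f"triple_{i}" for i in range(3)] for r in relations}

class Generator:
    def sample(self, relation):                            # shared rule generator
        return [f"rule_for_{relation}"]

class Predictor:
    def train_step(self, triples, rules):                  # reasoning predictor
        pass

generator = Generator()

# Old code (codes/): one predictor per relation; EM runs relation by relation.
for relation in relations:
    predictor_r = Predictor()                              # separate predictor per relation
    for em_iter in range(2):
        rules = generator.sample(relation)
        predictor_r.train_step(data[relation], rules)      # E-step on this relation only
        # M-step: feed this relation's high-quality rules back to the shared generator

# New code (RNNLogic+/): one predictor shared by all relations; batches mix relations.
predictor = Predictor()
for em_iter in range(2):
    for relation in relations:                             # in practice, interspersed batches
        rules = generator.sample(relation)
        predictor.train_step(data[relation], rules)
    # M-step over the rules of all relations at once
```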

2) Different optimization objectives are used to train the RNNLogic predictor in the old and new code. The old code appears to use a form of negative sampling (lines 634-644 in model_rnnlogic.py), whereas the new code uses a different objective (lines 86-96 in trainer.py). Since the loss functions differ, can we expect the same behavior from both versions of the code? (A schematic contrast of the two loss families we mean is given below.)
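To be concrete about the distinction we are drawing, below is a toy contrast between a negative-sampling-style loss and a softmax/cross-entropy-style loss. This is only a schematic of the two loss families, with made-up tensors; it is not the code from model_rnnlogic.py or trainer.py, and the exact formulations used in either file may differ.

```python
import torch
import torch.nn.functional as F

# Toy illustration of the two loss families; shapes and values are made up.
num_entities = 50
scores = torch.randn(4, num_entities)             # predictor scores for 4 queries over all entities
target = torch.tensor([3, 17, 42, 8])             # index of the true answer per query

# (a) Negative-sampling-style objective: push the positive score up and a few
#     randomly sampled negative scores down (one common formulation).
neg_idx = torch.randint(0, num_entities, (4, 5))  # 5 random negatives per query
pos_score = scores.gather(1, target.unsqueeze(1))
neg_score = scores.gather(1, neg_idx)
loss_negative_sampling = -(F.logsigmoid(pos_score).mean() + F.logsigmoid(-neg_score).mean())

# (b) Softmax/cross-entropy-style objective: normalize over all candidate entities
#     and maximize the probability of the true answer.
loss_cross_entropy = F.cross_entropy(scores, target)

print(loss_negative_sampling.item(), loss_cross_entropy.item())
```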

3) In the old code, several passes are made over the training data while training the predictor, after which the top-k rules with the best scores according to the H(rule) metric are passed to the generator, and the generator is trained to generate these rules. In the RNNLogic implementation in the new code, on the other hand, only one pass is made over the training data to train the predictor in a given EM iteration, and H(rule) is computed for all rules and used to train the generator, which trains on all of them. Could you please confirm whether these approaches are equivalent? (A toy sketch of the two rule-selection regimes is below.)
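For clarity, this is the rule-selection difference we are asking about, in toy form. The H(rule) values here are arbitrary numbers, and the selection details are only our reading of the code, not the actual implementation.

```python
# Toy sketch of the two rule-selection regimes; H(rule) values are arbitrary.
h_scores = {"rule_a": 0.91, "rule_b": 0.40, "rule_c": 0.75, "rule_d": 0.12}

# Old code: after several passes of predictor training, keep only the top-k rules
# by H(rule) and train the generator to generate those.
k = 2
top_k_rules = sorted(h_scores, key=h_scores.get, reverse=True)[:k]
print(top_k_rules)                                 # e.g. ['rule_a', 'rule_c']

# New code (RNNLogic part): after a single pass per EM iteration, compute H(rule)
# for every generated rule and train the generator on all of them.
all_rules = list(h_scores.items())                 # every rule together with its H(rule)
print(all_rules)
```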

4) Could you please tell us the significance of the pseudo-groundings used in the old code? The paper does not discuss the advantage of using them, and the new code does not use them either. Are they simply a heuristic for producing a score when, given a query head and a rule, no groundings are found with the rules produced by the generator? Do you have results showing how the presence or absence of pseudo-groundings affects the final performance of the model? (Our conjecture is sketched below.)
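To make our conjecture concrete, the behavior we imagine is something like the toy function below; this is purely a guess on our part, not the actual logic in the old code.

```python
# Our guess at what pseudo-groundings might be doing (not the actual old-code logic):
# when a rule has no grounding for a query, fall back to a small pseudo-grounding
# score instead of contributing nothing.
def rule_contribution(num_groundings: int, pseudo_weight: float = 0.1) -> float:
    if num_groundings > 0:
        return float(num_groundings)
    return pseudo_weight

print(rule_contribution(3), rule_contribution(0))  # 3.0 0.1
```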

5) The code for RNNLogic+ with embeddings (which produces the best results reported in the paper) is not present in the GitHub repository: neither the RotatE scores nor the knowledge graph embeddings used in the RNNLogic+ scoring function have been implemented. Will you be releasing the code for this version of the model in the future? (The kind of combination we have in mind is sketched below.)
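For reference, what we have in mind is an additive blend of the rule-based score with a RotatE/KGE score, schematically as below; the weight $\eta$ and the exact form of the combination are our assumption from reading the paper, not something we could verify in the released code.

$$\mathrm{scor}(h, r, t) \;=\; \mathrm{scor}_{\text{RNNLogic+}}(h, r, t) \;+\; \eta \cdot \mathrm{scor}_{\text{RotatE}}(h, r, t)$$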

6) Could you also provide the specifications of the environment in which you performed your experiments, along with training times? Our GPUs run out of memory when we try to run the old code with pseudo-groundings turned on. It would also be helpful if you could provide trained models (dumped into a pickle, or even the final .pth files in the workspace folder generated after a run of the old code), or the final set of high-quality rules produced after training RNNLogic on the datasets mentioned in the paper with optimal hyperparameters.

We look forward to your response to the queries above. Thanks in advance!

Regards,
Ananjan Nandi
Navdeep Kaur