INK-USC / RE-Net

Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs (EMNLP 2020)
http://inklab.usc.edu/renet/

Multi-step inference over time in the valid data #14

Status: Closed (closed by Lee-zix 5 years ago)

Lee-zix commented 5 years ago

In Section 2.4 of your paper: "the encoder state is updated based on current predictions, and will be used for making next predictions. That is, for each time step we rank the candidate entities and select top-m entities as current predictions. We maintain the history as a sliding window of length k, so the oldest interaction set will be detached and new predicted entity set will be added to the history."

In the files generated by get_history.py (train_history_ob1.txt, dev_history_ob1.txt, test_history_ob1.txt), s_hist and o_hist always store the ground truth history. In my understanding, the history used for prediction on the valid data should consist of two parts: (1) the ground truth history from the training data, and (2) the predictions for every sample that occurred before the current valid sample in the valid data. However, your code (lines 216~224 in model.py) reads:

    if len(self.s_hist_test[s][r]) == 0:
        self.s_hist_test[s][r] = s_hist.copy()
    s_history = self.s_hist_test[s][r]

If self.s_hist_test is the generated history, this code means: if there is no generated history, s_hist_test is set to the ground truth history; once a generated history exists, only the generated history is used for prediction. I am confused by this. If s_hist is the ground truth history, then following your paper, shouldn't the code replace the ground truth history in the valid data with the outputs of your model? I look forward to your reply!

woojeongjin commented 5 years ago

The condition "if len(self.s_hist_test[s][r]) == 0" means that the (s, r) pair is seen for the first time in the valid (or test) set after training. If the condition is satisfied, we use the ground truth history, which comes from the training dataset. The training dataset is given, so we assume that the histories in the training set are known. That is why we use the ground truth history from the training set when the pair is first seen in the valid (or test) set. dev_history_ob1.txt and test_history_ob1.txt contain the ground truth history from the training set for pairs that are first seen after training.
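The lookup described in this reply can be sketched as follows. This is a minimal illustration, not the repository's code: the class name `HistoryCache`, the method `get_history`, and the nested-dictionary layout of `s_hist_test` are assumptions made for the example; only the three-line condition itself comes from the thread.

```python
from collections import defaultdict


class HistoryCache:
    """Hypothetical stand-in for the s_hist_test lookup discussed above."""

    def __init__(self):
        # Assumed layout: subject -> relation -> list of per-time-step entity lists.
        # An empty list means the (s, r) pair has not been seen in valid/test yet.
        self.s_hist_test = defaultdict(lambda: defaultdict(list))

    def get_history(self, s, r, s_hist):
        # First time the pair is seen: seed its history with the ground truth
        # history (s_hist), which comes from the training set.
        if len(self.s_hist_test[s][r]) == 0:
            self.s_hist_test[s][r] = s_hist.copy()
        # On later calls the stored history is returned and s_hist is ignored.
        return self.s_hist_test[s][r]


cache = HistoryCache()
ground_truth = [[5], [6]]
history = cache.get_history(0, 1, ground_truth)  # seeded from ground truth
history.append([9])                              # a later model prediction
again = cache.get_history(0, 1, ground_truth)    # ground truth no longer consulted
```

Because `copy()` is applied to the outer list, appending predictions to the stored history leaves the ground truth list passed in by the caller unchanged.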

Thanks for the question!

Lee-zix commented 5 years ago

Thanks very much for your timely reply. Yes, if there is no generated history (the pair is first seen in the valid data), s_hist_test equals the ground truth history from the training set. But I still have one question: if the pair is not first seen in the valid data, does s_history = self.s_hist_test[s][r] mean that you use only the generated history and ignore the ground truth history from the training set?

woojeongjin commented 5 years ago

Yes. If the pair is not first seen in the valid data, then the condition is not satisfied. Thus, we use the generated history and ignore the ground truth history.

Lee-zix commented 5 years ago

This is somewhat different from what I expected, and a little different from the last sentence of Section 2.4: 'the oldest interaction set will be detached and new predicted entity set will be added to the history.' Thanks very much for your reply!

woojeongjin commented 5 years ago

The last sentence describes how histories are updated for each pair. Each pair has history buckets, where each bucket holds the interactions at one time step. When a pair is first seen in validation or testing, the buckets are filled with ground truth histories. Then, whenever new interactions are predicted for the pair at a time step, the oldest interaction set is detached and the new predicted interaction set is added. Thanks!
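The bucket update described here can be sketched in plain Python. This is a hedged illustration under stated assumptions, not the code in model.py: the helper names `top_m` and `update_history`, the use of plain lists for buckets, and the example scores are all hypothetical; the window length k and the top-m selection follow the wording quoted from Section 2.4.

```python
def top_m(scores, m):
    """Return the indices of the m highest-scoring candidate entities."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:m]


def update_history(history, scores, m, k):
    """Add the predicted top-m entity set as the newest bucket and
    drop the oldest buckets so that at most k buckets remain."""
    history.append(top_m(scores, m))
    while len(history) > k:
        history.pop(0)  # detach the oldest interaction set
    return history


# Buckets seeded with ground truth when the pair is first seen (k = 3).
history = [[1], [2], [3]]
# Hypothetical model scores over five candidate entities.
scores = [0.1, 0.7, 0.05, 0.9, 0.3]
update_history(history, scores, m=2, k=3)
print(history)  # [[2], [3], [3, 1]]
```

After k such updates every ground truth bucket has been pushed out of the window, which is consistent with the earlier statement that the generated history is what gets used once a pair has been seen.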