wangbo9719 / StAR_KGC


structure part #2

Closed yuanXuX closed 3 years ago

yuanXuX commented 3 years ago

How is the structure learning part mentioned in the paper added to the loss of the BERT classifier?

wangbo9719 commented 3 years ago

Thanks for your attention. We use a triple contrastive objective (Eq. (13), described in Section 3.3.1) for structure learning. The final loss used to train StAR is a weighted sum of this contrastive loss and the loss derived from the BERT/RoBERTa classifier; see line 473 of "StAR_KGC/StAR/kbc/models.py" for this procedure.
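For a concrete picture, here is a minimal sketch of such a weighted combination in PyTorch. The names (`lambda_struct`, `margin`) and the margin-ranking form of the contrastive term are illustrative assumptions, not the repository's actual code; the real objective (Eq. (13)) and weighting live in `StAR_KGC/StAR/kbc/models.py` around line 473.

```python
import torch
import torch.nn.functional as F

def combined_loss(pos_score, neg_score, cls_logits, cls_labels,
                  lambda_struct=1.0, margin=1.0):
    """Illustrative only: weighted sum of a triple-classification loss and a
    margin-based contrastive term over triple scores (a stand-in for the
    paper's Eq. (13))."""
    # Classifier loss from the BERT/RoBERTa head (binary triple classification).
    cls_loss = F.cross_entropy(cls_logits, cls_labels)
    # Contrastive term: positive triples should score higher than negatives.
    target = torch.ones_like(pos_score)
    contrastive_loss = F.margin_ranking_loss(pos_score, neg_score,
                                             target, margin=margin)
    # lambda_struct trades off structure learning against classification.
    return cls_loss + lambda_struct * contrastive_loss
```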

yuanXuX commented 3 years ago

Thank you for your reply! I want to reconfirm: compared to KG-BERT, StAR significantly reduces inference time. Is the first reason the bi-encoder structure, and the second that the dev set is randomly selected (could you further explain the relationship between get_new_dev_dict.py and the experiments in the paper)? For KG-BERT, it seems to take 30 days to complete inference on the FB15k-237 dataset. How much time is spent on the four datasets in the first stage (selecting the validation set) and the second stage (training and evaluation) of the StAR model?

wangbo9719 commented 3 years ago

Because validation on the original dev set is time-consuming, we selected 50 corrupting entities for the head entity and 50 for the tail entity of each triple in the dev set, using a simple trained model, e.g., BertForPairCls in "/StAR/kbc/models.py". The intuition behind this selection is that the negative entities most challenging for the previous context-based encoding approach are representative enough to evaluate the model and to help select learnable parameters. Hence, we recorded the negative entities with top-50 ranking scores for each dev example (separately for head and tail entities) to compose a new dev set. The running time on this dev set is similar to that of the prediction stage and is generally not counted toward the time cost. I did not record the training and inference time spent on FB15k-237; I will provide this data later.
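The selection procedure described above could look roughly like the sketch below. This is not get_new_dev_dict.py itself: `score_fn` is a placeholder for a trained pair classifier such as BertForPairCls, and the sketch assumes a higher score means the corrupted triple looks more plausible to the model, i.e., is a harder negative.

```python
import torch

@torch.no_grad()
def top_k_negatives(score_fn, triple, candidate_entities, k=50, corrupt="tail"):
    """Illustrative sketch: score every candidate replacement entity with a
    cheap trained model and keep the top-k hardest negatives for the
    reduced dev set."""
    h, r, t = triple
    # Corrupt either the tail or the head, skipping the true entity.
    if corrupt == "tail":
        corrupted = [(h, r, e) for e in candidate_entities if e != t]
    else:
        corrupted = [(e, r, t) for e in candidate_entities if e != h]
    # Higher score = more plausible to the model = harder negative.
    scores = torch.tensor([score_fn(c) for c in corrupted])
    top = scores.topk(min(k, len(corrupted))).indices.tolist()
    return [corrupted[i] for i in top]
```

Running this once per dev triple (for both `corrupt="head"` and `corrupt="tail"`) yields the 50+50 corrupting entities per example that compose the new dev set.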