Closed · chocolate9624 closed 2 years ago
Hi, I have updated the preprocess code to download the candidate set. Please check here.
Could you give more details about how to generate the candidates of valid data?
valid_url = "https://snap.stanford.edu/smore/valid.pt" I see this comment in the code: "# Specifically designed for OGB-LSC WikiKG v2. Since no candidates are provided by the original dataset, we generate candidates based on heuristics such as degrees / entity types." But I can't reproduce this, and the candidates per relation differ slightly (by less than about 1%).
In my opinion, the logic is `train_data.groupby('relation')['tail'].apply(lambda grp: list(grp.value_counts().nlargest(20000).index))` — is that right?
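A minimal runnable sketch of that per-relation top-K heuristic, using a toy DataFrame. The column names (`relation`, `tail`) and the top-20000 cutoff come from the snippet above; whether this matches SMORE's actual candidate-generation code is exactly the open question here:

```python
import pandas as pd

# Hypothetical toy triples; the real input would be the OGB-LSC WikiKG v2 training triples.
train_data = pd.DataFrame({
    "relation": [0, 0, 0, 0, 1, 1, 1],
    "tail":     [5, 5, 5, 7, 9, 9, 8],
})

K = 2  # the snippet above uses 20000 for the real dataset

# For each relation, take the K most frequent tail entities as its candidate set.
candidates = (
    train_data.groupby("relation")["tail"]
    .apply(lambda grp: list(grp.value_counts().nlargest(K).index))
)
print(candidates.to_dict())  # → {0: [5, 7], 1: [9, 8]}
```

Note that `value_counts` breaks frequency ties by its own ordering, so if the real heuristic also uses degrees or entity types as a tie-breaker, a sub-1% mismatch in the candidate sets would be expected.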
Hi, I noticed the evaluation data for WikiKG is loaded at this line: https://github.com/google-research/smore/blob/5b1a8a00b0cbfa024f411fc080b3d46dc681edd8/smore/training/main_train.py#L212
However, I cannot find the code that generates the candidate tail entities used during evaluation. Could you share more details?
Thanks a lot!