jhyuklee closed this issue 4 years ago
Hey @jhyuklee,
when you run run_experiments.py, some of the datapoints are filtered out. See https://github.com/facebookresearch/LAMA/blob/master/scripts/batch_eval_KB_completion.py#L223 for the filtering step. Once you run the script, you should get the same number of datapoints as we report in the paper.
Ciao, Fabio
Hi, thank you for open-sourcing this great project!
I looked into the datasets provided in this repository (https://dl.fbaipublicfiles.com/LAMA/data.zip), and some of their sizes do not match the sizes reported in the paper:
ConceptNet: 11458 (paper) vs 29774 (dataset)
Google-RE death-place: 765 (paper) vs 766 (dataset)
Also, for the TREx dataset, could you explain how the sentences are selected from the 'evidences' in each line of the jsonl file? There seem to be multiple 'masked_sentence' entries in 'evidences'.
Thank you.
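For anyone who wants to reproduce the raw counts above, here is a minimal sketch of how the records and 'masked_sentence' entries in a jsonl file could be tallied. It assumes each line is a JSON object whose 'evidences' field is a list of objects; the function name `summarize_jsonl` and the sample records are hypothetical and are not taken from the LAMA code or data. It counts raw lines only and does not apply the filtering that the evaluation script performs.

```python
import json
from typing import Dict, Iterable


def summarize_jsonl(lines: Iterable[str]) -> Dict[str, int]:
    """Count records and 'masked_sentence' entries found under 'evidences'.

    Assumption: every non-empty line is a JSON object, and 'evidences'
    (when present) is a list of objects.
    """
    n_records = 0
    n_masked = 0
    n_multi = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        n_records += 1
        evidences = record.get("evidences", [])
        count = sum(1 for ev in evidences if "masked_sentence" in ev)
        n_masked += count
        if count > 1:
            n_multi += 1  # record carries more than one masked sentence
    return {
        "records": n_records,
        "masked_sentences": n_masked,
        "records_with_multiple": n_multi,
    }


# Hypothetical sample lines, made up for illustration only.
sample = [
    json.dumps({"evidences": [{"masked_sentence": "A was born in [MASK]."},
                              {"masked_sentence": "[MASK] is the birthplace of A."}]}),
    json.dumps({"evidences": [{"masked_sentence": "B died in [MASK]."}]}),
]
summary = summarize_jsonl(sample)
print(summary)
```

To run this on a real file, pass an open file handle instead of `sample`, for example `summarize_jsonl(open(path, encoding="utf-8"))`, where `path` points to one of the extracted jsonl files.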