ramesesz / master-thesis


Find/create knowledge graph for evaluation dataset #13

Open ramesesz opened 2 months ago

ramesesz commented 2 months ago

The first framework we investigated was WikiDatasets. Because creating the labels from the full dump took too long and ran into errors, we took the labels from the April 15, 2020 Wikidata dump instead.

Filtering the labels down to those contained in the subclasses of machine_elements yielded a dataset of only 374 rows, too few for our needs.

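import pickle
from wikidatasets.processFunctions import get_subclasses, query_wikidata_dump, build_dataset

path = 'machine_elements/'
labels_path = 'labels/'

# Load the pickled labels dictionary, closing the file afterwards
with open(labels_path + 'labels.pkl', 'rb') as f:
    labels = pickle.load(f)

# test_entities is not defined in this snippet; it is assumed to hold the
# entity IDs collected from the subclasses of machine_elements
filtered_label = {key: labels[key] for key in test_entities if key in labels}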
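To check whether the small result comes from the entity list itself or from entities that have no entry in the labels file, the filtering step above can be tried on toy data. This is only a minimal sketch: `filter_labels`, `toy_labels`, and `toy_entities` are hypothetical names, and the IDs and labels are made up rather than taken from the dump.

```python
def filter_labels(labels, entity_ids):
    """Keep only the labels whose key appears in entity_ids."""
    wanted = set(entity_ids)
    return {key: label for key, label in labels.items() if key in wanted}


# Toy stand-ins (hypothetical IDs and labels, not from the real dump)
toy_labels = {"Q1": "bolt", "Q2": "gear", "Q3": "river"}
toy_entities = ["Q1", "Q2", "Q99"]

filtered = filter_labels(toy_labels, toy_entities)

# Entities that were requested but have no label at all
missing = [e for e in toy_entities if e not in toy_labels]

print(len(filtered), "labels kept;", len(missing), "entities without a label")
```

If `missing` turns out to be large on the real data, the labels file is the bottleneck; if it is small, the subclass query is simply returning few entities.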
