The first framework we investigated was Wikidatasets. Because generating the labels from the full dump was too time-consuming and ran into errors, we took the precomputed labels from the April 15, 2020 Wikidata dump.
Filtering those labels down to the entities in the subclasses of machine_elements produced a dataset of only 374 rows, too few for our needs.
import pickle
from wikidatasets.processFunctions import get_subclasses, query_wikidata_dump, build_dataset
path = 'machine_elements/'
labels_path = 'labels/'
# Labels precomputed from the April 15, 2020 Wikidata dump
labels = pickle.load(open(labels_path + 'labels.pkl', 'rb'))
# test_entities holds the Q-ids of the subclasses of the machine element concept, obtained with get_subclasses()
filtered_label = {key: labels[key] for key in test_entities if key in labels}
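For completeness, the lines below sketch how the rest of the wikidatasets pipeline would turn these filtered labels into a dataset, following the usage shown in the library's documentation. The dump path, the machine element Q-id, and n_lines are placeholders, and the exact argument names reflect our reading of the wikidatasets API rather than guaranteed signatures.

from wikidatasets.processFunctions import get_subclasses, query_wikidata_dump, build_dataset

path = 'machine_elements/'
dump_path = 'wikidata-20200415-all.json.bz2'  # placeholder: local path to the April 15, 2020 dump
machine_element_qid = 'Q...'                  # placeholder: Wikidata Q-id of the machine element concept
n_lines = 56_000_000                          # placeholder: approximate line count of the dump, used for progress reporting

# Collect the Q-ids of all subclasses of the machine element concept
test_entities = get_subclasses(machine_element_qid)

# Scan the dump and keep only facts involving the test entities;
# labels are not collected here because we reuse the precomputed pickle above
query_wikidata_dump(dump_path, path, n_lines, test_entities=test_entities, collect_labels=False)

# Assemble the final dataset files from the collected facts and the filtered labels
build_dataset(path, filtered_label)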