Closed pavlostheodorou closed 3 years ago
The deployment.csv file has to be generated using the indexes from: https://github.com/jjacampos/FeedbackWeightedLearning/blob/master/data/doc_class_splits/deployment_indexes.txt.
The original file is: https://www.kaggle.com/danofer/dbpedia-classes?select=DBPEDIA_train.csv
Then indexes from https://github.com/jjacampos/FeedbackWeightedLearning/blob/master/data/doc_class_splits/train_indexes.txt are used for training and https://github.com/jjacampos/FeedbackWeightedLearning/blob/master/data/doc_class_splits/deployment_indexes.txt for deployment.
So, if I understood correctly (correct me if I am wrong): in order to run the code, I first have to run a script (not included in the current project) that extracts train.csv and deployment.csv from the original DBPEDIA_train.csv, using the indexes in train_indexes.txt and deployment_indexes.txt respectively?
Yes, you are right. We didn't want to upload a modified dataset as we don't own it.
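For anyone landing here: the extraction step described above can be sketched roughly like this. This is not part of the repo, just a minimal stdlib-only sketch, and it assumes each line of the index files holds one zero-based row index into DBPEDIA_train.csv (check the index files against the Kaggle CSV before relying on that):

```python
import csv

def split_by_indexes(source_csv, index_file, output_csv):
    # Read every row of the full Kaggle DBpedia training set,
    # keeping the header separate so it can be re-emitted.
    with open(source_csv, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # Assumption: one zero-based row index per line; blank lines ignored.
    with open(index_file, encoding="utf-8") as f:
        indexes = [int(line) for line in f if line.strip()]

    # Write only the selected rows, preserving the original header.
    with open(output_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows[i] for i in indexes)

# Example usage (file names assumed, run from the directory holding the files):
# split_by_indexes("DBPEDIA_train.csv", "train_indexes.txt", "train.csv")
# split_by_indexes("DBPEDIA_train.csv", "deployment_indexes.txt", "deployment.csv")
```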
Is the deployment.csv file (from https://www.kaggle.com/danofer/dbpedia-classes) the DBP_wiki_data.csv file?