jjacampos / FeedbackWeightedLearning

MIT License

deployment.csv file missing #1

Closed by pavlostheodorou 3 years ago

pavlostheodorou commented 3 years ago

Is the deployment.csv file the same as the DBP_wiki_data.csv file from https://www.kaggle.com/danofer/dbpedia-classes?

jjacampos commented 3 years ago

The deployment.csv file has to be generated using the indexes from: https://github.com/jjacampos/FeedbackWeightedLearning/blob/master/data/doc_class_splits/deployment_indexes.txt.

The original file is: https://www.kaggle.com/danofer/dbpedia-classes?select=DBPEDIA_train.csv

The indexes in https://github.com/jjacampos/FeedbackWeightedLearning/blob/master/data/doc_class_splits/train_indexes.txt select the rows used for training, and the indexes in https://github.com/jjacampos/FeedbackWeightedLearning/blob/master/data/doc_class_splits/deployment_indexes.txt select the rows used for deployment.

pavlostheodorou commented 3 years ago

So, if I understood correctly (correct me if I am wrong), in order to run the code I have to write and run a script, which is not included in the current project, that extracts train.csv and deployment.csv from the original file (DBPEDIA_train.csv) using the indexes in train_indexes.txt and deployment_indexes.txt respectively?

jjacampos commented 3 years ago

Yes, you are right. We didn't want to upload a modified dataset as we don't own it.
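
For anyone landing here later, a minimal sketch of such an extraction script follows. It is not part of the repository, and it rests on assumptions that are worth checking against the actual files: that each index file holds one integer per line, that the indexes are 0-based positions of data rows in DBPEDIA_train.csv (header row not counted), and that the CSV has a header row which should be copied to the output. The function names `read_indexes` and `extract_rows` are hypothetical.

```python
import csv


def read_indexes(index_path):
    """Read row indexes from a text file.

    Assumption: one integer per non-empty line.
    """
    with open(index_path, encoding="utf-8") as f:
        return {int(line) for line in f if line.strip()}


def extract_rows(source_csv, index_path, output_csv):
    """Write the header plus the selected data rows to output_csv.

    Assumption: indexes are 0-based positions of data rows,
    counted after the header row. Returns the number of rows written.
    """
    wanted = read_indexes(index_path)
    written = 0
    with open(source_csv, newline="", encoding="utf-8") as src, \
            open(output_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # copy the header row unchanged
        for position, row in enumerate(reader):
            if position in wanted:
                writer.writerow(row)
                written += 1
    return written
```

Under those assumptions, calling `extract_rows("DBPEDIA_train.csv", "train_indexes.txt", "train.csv")` and `extract_rows("DBPEDIA_train.csv", "deployment_indexes.txt", "deployment.csv")` would produce the two files. Rows are written in the order they appear in the source file, not in the order listed in the index file. If the returned count is smaller than the number of indexes, the indexing convention is probably different from the one assumed here.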