Open johnp2266 opened 6 years ago
Take a look at the file data_helpers.py. There is a method, load_data_and_labels(positive_data_file, negative_data_file), which returns the data and labels.
You can add a new method that reads in your own data, turns your labels into one-hot encoded form, and returns both. Then, in eval.py, change the code to call your new method.
Something like:
import numpy as np

def load_multilabel_data_and_labels():
    # Add code here to read in your own data. It should produce:
    #   new_texts: the list of text strings
    #   labels:    one row of raw label values per text
    # Then turn your labels into one-hot encoded form.
    # Here there are 5 classes, so each label becomes a vector of length 5,
    # e.g. [1,0,0,0,0], where exactly one entry is 1 and the rest are 0,
    # meaning [class1=1, class2=0, class3=0, ...].
    nr_classes = 5
    nr_lines = len(labels)
    new_labels = np.zeros((nr_lines, nr_classes))  # shape must be passed as a tuple
    for labelnr, value in enumerate(labels):
        if value[0] == 1:
            new_labels[labelnr][0] = 1  # set the one-hot entry for this class
        elif value[0] == 0.7:
            new_labels[labelnr][1] = 1
        elif value[0] == 0.5:
            new_labels[labelnr][2] = 1
        elif value[1] == 0.7:  # note: this branch checks the second column
            new_labels[labelnr][3] = 1
        elif value[0] == 0:
            new_labels[labelnr][4] = 1
    x_text = new_texts
    y = new_labels
    return [x_text, y]
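If your raw labels are arbitrary class identifiers rather than the specific values checked above, the one-hot step can be built from the data itself. This is only a minimal sketch: the helper name one_hot_encode and the sample labels are made up for illustration and are not part of data_helpers.py.

```python
import numpy as np


def one_hot_encode(raw_labels):
    """Map each distinct raw label to a one-hot row (hypothetical helper)."""
    # Sort the distinct labels so the class order is deterministic.
    classes = sorted(set(raw_labels))
    class_index = {label: i for i, label in enumerate(classes)}
    # Row i of the identity matrix is the one-hot vector for class i.
    indices = [class_index[label] for label in raw_labels]
    one_hot = np.eye(len(classes), dtype=int)[indices]
    return one_hot, classes


# Made-up sample labels, for illustration only.
y, classes = one_hot_encode(["sports", "news", "sports", "tech"])
```

Each row of y has one column per distinct label, so a dataset with 7 classes would give rows of length 7.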
In eval.py, find this line and replace it with a call to your new method:
x_text, y = data_helpers.load_data_and_labels(FLAGS.positive_data_file, FLAGS.negative_data_file)
Is this enough? Do I need to change the definition of the model?
Hi,
I'm new to this. I have 7 different types of data in my dataset. What do I need to change in eval.py to load it?