Training on my own data set

omaryashraf5 commented 6 years ago

Hi, I am reproducing your code to train LSTM, GRU and GAN based models on my own data. I have created two files: one contains the positive text and the other contains the negative text. I get an error at this line of the datahelpers.py file: positive_examples = list(open(positive_data_file, "r", encoding="UTF8").readlines())

The error occurs because my text is not UTF-8 encoded. Could I just remove the encoding argument, or do I need to perform an additional preprocessing step on my data?

Thanks!

Omar

omaryashraf5 commented 6 years ago

I figured it out! The files were encoded as ISO-8859-1.
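For anyone hitting the same error, here is a minimal sketch of two ways to handle it, assuming the data files really are ISO-8859-1. The function names load_examples and convert_to_utf8 are hypothetical and are not part of this repository. The first passes the actual encoding to open(); the second rewrites the file as UTF-8 once, so the original loading line can stay unchanged. Simply dropping the encoding argument is riskier, because Python then falls back to a platform-dependent default.

```python
def load_examples(path, encoding="ISO-8859-1"):
    """Read one example per line, decoding with the given encoding.

    Hypothetical helper: the default encoding is an assumption about
    the data files, not something the repository prescribes.
    """
    with open(path, "r", encoding=encoding) as f:
        # Strip surrounding whitespace, including the trailing newline.
        return [line.strip() for line in f]


def convert_to_utf8(src_path, dst_path, src_encoding="ISO-8859-1"):
    """Rewrite a text file as UTF-8 so UTF-8-only loaders can read it.

    newline="" keeps the original line endings untouched on both sides.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Example usage (file names are placeholders):
#   positive_examples = load_examples("my_data.pos")
#   convert_to_utf8("my_data.pos", "my_data.pos.utf8")
```

Note that ISO-8859-1 maps every possible byte to a character, so decoding with it never raises an error even when the guess is wrong. It is worth looking at a few decoded lines to check that accented characters come out as expected.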

roomylee commented 6 years ago

@omaryashraf5 Good luck 👍

roomylee / rnn-text-classification-tf, issue #2