Closed: howardhsu closed this issue 5 years ago
The initial SemEval 2014 release contained a few more training examples in both res and lt. The organizers later replaced the training set with a version that has a few examples removed, and only that later version can be downloaded now. In fact, I used the same later version of the training set as you did (the link is given in the README). I guess that when I wrote the paper I didn't realise this and copied the data statistics table from some SemEval14 papers.
Thanks for your explanation.
Thanks for sharing. From the preprocessed data, I realized that the example counts (from my script) are not the same as those reported in the paper.
For example, the training data of SemEval 2014 looks like this:
lt  Counter({'positive': 987, 'negative': 866, 'neutral': 460})
res Counter({'positive': 2164, 'negative': 805, 'neutral': 633})
Did I make any mistake?
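For reference, here is a minimal sketch of how per-polarity counts like the ones above could be computed. It is not the script used in this thread. It assumes the raw SemEval 2014 Task 4 XML layout, where each `aspectTerm` element carries a `polarity` attribute, and it assumes that only positive, negative, and neutral labels are kept (so `conflict` is dropped). The `SAMPLE_XML` string and the `count_polarities` helper are hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample in the assumed SemEval 2014 Task 4 XML layout.
SAMPLE_XML = """<sentences>
  <sentence id="1">
    <text>The battery life is great but the screen is dim.</text>
    <aspectTerms>
      <aspectTerm term="battery life" polarity="positive" from="4" to="16"/>
      <aspectTerm term="screen" polarity="negative" from="34" to="40"/>
    </aspectTerms>
  </sentence>
  <sentence id="2">
    <text>It comes with a keyboard.</text>
    <aspectTerms>
      <aspectTerm term="keyboard" polarity="neutral" from="16" to="24"/>
    </aspectTerms>
  </sentence>
  <sentence id="3">
    <text>The price is a mixed bag.</text>
    <aspectTerms>
      <aspectTerm term="price" polarity="conflict" from="4" to="9"/>
    </aspectTerms>
  </sentence>
</sentences>"""


def count_polarities(xml_text, keep=("positive", "negative", "neutral")):
    """Count aspect-term polarity labels, keeping only the labels in `keep`."""
    root = ET.fromstring(xml_text)
    # Walk every aspectTerm element in the tree and read its polarity attribute.
    labels = (term.get("polarity") for term in root.iter("aspectTerm"))
    # Assumption: labels outside `keep` (e.g. "conflict") are excluded.
    return Counter(label for label in labels if label in keep)


counts = count_polarities(SAMPLE_XML)
print(counts)
```

To run this on a real file, read its contents into a string and pass that to `count_polarities`. Comparing the totals with and without `conflict` in `keep` is one quick way to check whether label filtering explains a mismatch with a published table.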