missing a few training data ?

ruidan / Aspect-level-sentiment

Code and dataset for ACL2018 paper "Exploiting Document Knowledge for Aspect-level Sentiment Classification"

Apache License 2.0


missing a few training data ? #3

Closed: howardhsu closed this issue 5 years ago

howardhsu commented 5 years ago

Thanks for sharing. Looking at the preprocessed data, I noticed that the example counts (from my script) do not match those reported in the paper.

For example, the training data of SemEval 2014 looks like this:

lt Counter({'positive': 987, 'negative': 866, 'neutral': 460})
res Counter({'positive': 2164, 'negative': 805, 'neutral': 633})

Did I make any mistake?
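For reference, counts like the ones above could be produced with a short script along these lines. This is only a minimal sketch, not the script actually used: it assumes the labels are available as one polarity string per line, and the `count_labels` helper and the sample data are hypothetical.

```python
from collections import Counter


def count_labels(lines):
    """Count sentiment labels, assuming one label per line; blank lines are skipped."""
    return Counter(line.strip() for line in lines if line.strip())


# Hypothetical in-memory sample standing in for a label file (assumption:
# one of 'positive', 'negative', 'neutral' per line).
sample = ["positive\n", "negative\n", "positive\n", "neutral\n", "\n"]
print(count_labels(sample))
```

An open file handle can be passed in place of `sample`, since the helper only iterates over lines.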

ruidan commented 5 years ago

The initial SemEval 2014 dataset contained a few more training examples in both res and lt, but they later revised the training set and removed a few examples. Now only the later version can be downloaded. In fact, I also used the later version of the training set, the same as yours (the link is given in the README). I guess that when I wrote the paper I didn't realise this and copied the data statistics table from some SemEval14 papers.

howardhsu commented 5 years ago

Thanks for your explanation.