richliao / textClassifier

Text classifier for Hierarchical Attention Networks for Document Classification
Apache License 2.0

pre-processing issue #17

Open jhoh10 opened 7 years ago

jhoh10 commented 7 years ago

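With the way the pre-processing is written, it seems to leak information from the validation set into training. The word index is built from both the training and validation sets, so every validation word already has an entry in the vocabulary. When real test samples come in, they won't get the same treatment: they may contain words that were never indexed. Does anyone else agree?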
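To make the concern concrete, here is a minimal sketch of the alternative: build the word index from the training split only and map any unseen word to a reserved out-of-vocabulary id. This is not the repository's code. The helpers `build_word_index` and `texts_to_sequences`, the toy sentences, and the id convention (0 for padding, 1 for unknown words) are all assumptions made for illustration.

```python
from collections import Counter

PAD_ID = 0  # assumed convention: 0 is reserved for padding
OOV_ID = 1  # assumed convention: 1 stands for any word not seen in training


def build_word_index(texts, max_words=None):
    """Hypothetical helper: map each word to an integer id, most frequent first."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Real ids start at 2 because 0 and 1 are reserved above.
    return {
        word: rank + 2
        for rank, (word, _) in enumerate(counts.most_common(max_words))
    }


def texts_to_sequences(texts, word_index):
    """Hypothetical helper: encode texts, sending unseen words to OOV_ID."""
    return [
        [word_index.get(word, OOV_ID) for word in text.lower().split()]
        for text in texts
    ]


# Toy data, invented for this example.
train_texts = ["the movie was great", "the plot was dull"]
val_texts = ["the acting was superb"]

# The index is fitted on the training split only.
word_index = build_word_index(train_texts)

train_seqs = texts_to_sequences(train_texts, word_index)
val_seqs = texts_to_sequences(val_texts, word_index)

print(val_seqs)  # words absent from the training data appear as OOV_ID
```

With this split the validation set is encoded exactly the way later test samples would be, so validation accuracy should be a fairer estimate of how the model handles unseen vocabulary.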