gaussic / text-classification-cnn-rnn

CNN-RNN Chinese text classification, based on TensorFlow
MIT License
4.16k stars 1.47k forks

Hello, a problem when running on my own dataset #138

Closed JMF141114 closed 4 years ago

JMF141114 commented 4 years ago

Traceback (most recent call last):
  File "run_cnn.py", line 199, in <module>
    train()
  File "run_cnn.py", line 80, in train
    x_train, y_train = process_file(train_dir, word_to_id, cat_to_id, config.seq_length)
  File "E:\Pycharm2017\毕设文本分类\text-classification-cnn-rnn-master\data\cnews_loader.py", line 107, in process_file
    label_id.append(cat_to_id[labels[i]])
KeyError: '一般缺陷 '

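When I built the dataset, I put a single tab character (\t) between the category label and the content, but I keep running into this problem.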
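The key in the KeyError above ends with a space ('一般缺陷 '), which suggests that at least one label in the data file is not an exact match for the category names the code expects. A minimal sketch for locating such lines, assuming one `label<TAB>content` record per line; the function name `find_bad_labels` and the sample content are hypothetical, and only the label itself comes from the traceback:

```python
def find_bad_labels(lines, categories):
    """Return (line_number, label) pairs whose label is not an exact category match.

    Assumes each record is 'label<TAB>content'. Lines without a tab are
    reported too, using the whole line as the label.
    """
    known = set(categories)
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        label, sep, _content = line.partition("\t")
        if not sep or label not in known:
            bad.append((lineno, label))
    return bad


# Hypothetical sample: the second label has a trailing space.
sample = [
    "一般缺陷\tsample text A\n",
    "一般缺陷 \tsample text B\n",
    "line without a tab\n",
]
for lineno, label in find_bad_labels(sample, ["一般缺陷"]):
    # repr() makes trailing or leading whitespace visible
    print(lineno, repr(label))
```

To check a real file, the same function can be given an open file object, for example `find_bad_labels(open(path, encoding="utf-8"), categories)`, where `path` and `categories` stand for your own training file and category list.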

swagglian commented 4 years ago

May I ask how you ended up solving this?

JMF141114 commented 4 years ago

Try rebuilding your dataset in Notepad++, checking it against the original dataset; there is probably an error somewhere in your data.
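As an alternative to fixing the file by hand, the cleanup could also be scripted. This is only a sketch under assumptions: it treats every record as `label<TAB>content`, strips stray whitespace around both parts, and drops lines that do not fit. The names `clean_line` and `clean_dataset` and both file paths are hypothetical and not part of the repository.

```python
def clean_line(line):
    """Normalize one 'label<TAB>content' record.

    Strips whitespace (including full-width spaces) around the label and
    the content. Returns None if the line has no tab or either part is empty.
    """
    label, sep, content = line.rstrip("\r\n").partition("\t")
    label, content = label.strip(), content.strip()
    if not sep or not label or not content:
        return None
    return f"{label}\t{content}"


def clean_dataset(src_path, dst_path, encoding="utf-8"):
    """Write a cleaned copy of src_path to dst_path; return the number of dropped lines."""
    dropped = 0
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            cleaned = clean_line(line)
            if cleaned is None:
                dropped += 1  # malformed record, not written to the output
            else:
                dst.write(cleaned + "\n")
    return dropped
```

Writing to a separate output file keeps the original data intact, so the dropped lines can still be inspected and repaired manually afterwards.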

swagglian commented 4 years ago

Thank you. I solved it yesterday; it was a problem with the dataset.