dennybritz / cnn-text-classification-tf

Convolutional Neural Network for Text Classification in Tensorflow
Apache License 2.0

Can this model apply to Chinese text classification? #99

Closed dyllanwli closed 3 years ago

dyllanwli commented 7 years ago

Hi, I am very interested in NLP with neural networks. Is it possible to use this model for Chinese text classification?

stayrea1 commented 7 years ago

of course you can.

Psycho7 commented 6 years ago

Yes, I did the same thing a few months ago. It turned out well using the Sogou news dataset. However, I also noticed someone stating that the structure should be modified to get higher accuracy. I didn't try that, since I had already got what I wanted.

calmzealA commented 6 years ago

@Psycho7 Hi, I have two questions about the Sogou news dataset:

  1. Is the dataset already word-segmented?
  2. How were the positive and negative labels assigned?

xwj0813 commented 6 years ago

@Psycho7 Regarding Chinese word segmentation, I would also like to ask how the labels are determined. Thanks.

Psycho7 commented 6 years ago

@calmzealA Sorry for my late response. Also to @xwj0813.

I didn't use the Sogou dataset directly; what I used is a version that had been processed by @gaussic. It seems he has since deleted the blog post I read, but you can still have a look at this repo, which may help you a lot.

OK, let me answer your questions now.

Is the dataset already word-segmented?

No. But you can very easily do text segmentation. I used jieba and the result is pretty good.
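For anyone following along, here is a minimal sketch of what the segmentation step might look like. It is not the exact preprocessing used above. It assumes the downstream pipeline expects space-separated tokens, and the `tokenize` and `segment` helpers and the sample sentence are hypothetical names made up for illustration. `jieba.lcut` is jieba's list-returning segmentation call; if jieba is not installed, the sketch falls back to character-level tokens, which is a different (cruder) technique.

```python
try:
    import jieba  # third-party package: pip install jieba

    def tokenize(text):
        # Word-level segmentation with jieba's default (accurate) mode.
        return jieba.lcut(text)
except ImportError:
    def tokenize(text):
        # Fallback (assumption, not what was used above): one token per character.
        return list(text)


def segment(text):
    """Return the text as space-separated tokens, dropping whitespace-only tokens."""
    return " ".join(tok for tok in tokenize(text) if tok.strip())


# Hypothetical sample sentence: "I like natural language processing".
sentence = "我喜欢自然语言处理"
segmented = segment(sentence)
print(segmented)
```

The segmented lines can then be written to a text file, one document per line, so that a whitespace-based tokenizer sees each word as a separate token.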

How were the positive and negative labels assigned?

You don't have to stick with positive vs. negative. Since news articles are already classified as Sports, Politics, etc., you can use their categories as labels, which is what most people do.
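To make the category-as-label idea concrete, here is a rough sketch of turning category names into one-hot label vectors. The category names, the `build_label_index` and `one_hot` helpers, and the sample data are all hypothetical; the real category set depends on the dataset you use.

```python
def build_label_index(categories):
    """Map each distinct category name to an integer id (sorted for determinism)."""
    return {name: i for i, name in enumerate(sorted(set(categories)))}


def one_hot(categories, label_index):
    """Turn a list of category names into one-hot label vectors."""
    vectors = []
    for name in categories:
        row = [0] * len(label_index)
        row[label_index[name]] = 1
        vectors.append(row)
    return vectors


# Hypothetical categories, one per news article.
sample_categories = ["sports", "politics", "sports", "finance"]
label_index = build_label_index(sample_categories)
labels = one_hot(sample_categories, label_index)
print(label_index)
print(labels)
```

With more than two categories, the number of output classes in the model has to match `len(label_index)` instead of 2.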