dyllanwli closed this issue 3 years ago.
Of course you can.
Yes, I did the same thing a few months ago. It turned out well using the Sogou news dataset. However, I also noticed someone stating that the structure should be modified to get higher accuracy. I didn't try that, since I had already got what I wanted.
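@Psycho7 Hi, I have two questions about the Sogou news dataset:
@Psycho7 Regarding Chinese word segmentation, I would also like to ask how the labels are determined. Thanks.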
@calmzealA Sorry for my late response. Also to @xwj0813.
I didn't use the Sogou dataset directly; I used a version that had already been processed by @gaussic. It seems he has since deleted the blog post I read, but you can still have a look at this repo, which may help you a lot.
OK, let me answer your questions now.
Does the dataset come with word segmentation?
No, but text segmentation is easy to do yourself. I used jieba and the result was pretty good.
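For anyone who wants to try this, here is a minimal sketch of the segmentation step. It assumes the third-party `jieba` package is installed and uses its `jieba.lcut` function; the helper name `segment_corpus` and the choice to join tokens with spaces are my own assumptions for illustration, not something taken from this repo.

```python
from typing import Callable, Iterable, List, Optional


def segment_corpus(
    texts: Iterable[str],
    cut: Optional[Callable[[str], List[str]]] = None,
) -> List[str]:
    """Split each text into tokens and join them with single spaces.

    `cut` is any function mapping a string to a list of tokens.
    If it is omitted, jieba.lcut is used (requires `pip install jieba`).
    """
    if cut is None:
        import jieba  # third-party; imported lazily so a custom `cut` works without it

        cut = jieba.lcut
    segmented = []
    for text in texts:
        # Drop whitespace-only tokens so the output has clean single spaces.
        tokens = [tok for tok in cut(text) if tok.strip()]
        segmented.append(" ".join(tokens))
    return segmented


if __name__ == "__main__":
    # Uses jieba by default; the exact split depends on jieba's dictionary.
    print(segment_corpus(["我来到北京清华大学"]))
```

The space-joined output can then be fed to whatever tokenizer or vocabulary builder the model expects for English text.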
How were the positive and negative labels assigned?
You don't have to stick with positive vs. negative. Since news articles are classified as Sports, Politics, etc., you can treat their categories as labels, which is what most people do.
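As a rough illustration of treating categories as labels, the sketch below maps category names to integer class ids. The function names `build_label_index` and `encode_labels` and the sample category strings are hypothetical; the actual category names depend on the dataset you use.

```python
from typing import Dict, Iterable, List


def build_label_index(categories: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct category a stable integer id (sorted by name)."""
    return {name: i for i, name in enumerate(sorted(set(categories)))}


def encode_labels(categories: Iterable[str], label_index: Dict[str, int]) -> List[int]:
    """Convert a sequence of category names into their integer ids."""
    return [label_index[c] for c in categories]


if __name__ == "__main__":
    # Hypothetical categories, one per news article.
    cats = ["Sports", "Politics", "Sports", "Finance"]
    index = build_label_index(cats)
    print(index)
    print(encode_labels(cats, index))
```

Sorting the names keeps the mapping the same across runs, which matters if you save the model and reload it later.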
Hi, I am very interested in NLP with neural networks. Is it possible to use this model for Chinese text classification?