brightmart / text_classification

all kinds of text classification models and more with deep learning
MIT License
7.83k stars 2.57k forks

Where can I download test-zhihu-forpredict-v4only-title.txt? Please share it, thanks #84

Open wujx2018 opened 5 years ago

wujx2018 commented 5 years ago

Could someone please share test-zhihu-forpredict-v4only-title.txt? Thanks.

Amanda2024 commented 5 years ago

I'm looking for the same file.

brightmart commented 5 years ago

Check this section of README.md:

Sample data: cached file

To help you run this repository, we have re-generated the training/validation/test data and the vocabulary/labels, and saved them as cache files using h5py. We suggest you download them from the link above.

It contains everything you need to run this repository: the data is pre-processed, so you can start training the model in a minute.

It's a zip file of about 1.8 GB that contains 3 million training examples. Although it is quite big after unzipping, with the help of HDF5 it needs only a normal amount of memory (e.g. 8 GB or less) during training.
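The reason HDF5 keeps memory use low is that a dataset can be sliced directly from disk, so only the current batch is loaded. Below is a minimal sketch of that idea, not the repository's actual loader. The dataset names `train_X` and `train_Y` and the function names are hypothetical; check the real names in the cache file before using it.

```python
def iter_batches(inputs, labels, batch_size):
    """Yield (inputs, labels) slices of at most batch_size rows.

    Works with any sliceable objects of equal length, including h5py
    datasets, which read only the requested rows from disk.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if len(inputs) != len(labels):
        raise ValueError("inputs and labels must have the same length")
    total = len(inputs)
    for start in range(0, total, batch_size):
        stop = min(start + batch_size, total)
        yield inputs[start:stop], labels[start:stop]


def count_cached_examples(cache_path, x_name="train_X", y_name="train_Y",
                          batch_size=1024):
    """Walk a cached HDF5 file batch by batch and count the examples.

    x_name and y_name are assumed dataset names, not confirmed by the README.
    """
    import h5py  # third-party: pip install h5py

    count = 0
    with h5py.File(cache_path, "r") as cache:
        for batch_x, _ in iter_batches(cache[x_name], cache[y_name], batch_size):
            count += len(batch_x)  # only this batch is held in memory
    return count
```

In a real training loop, each yielded batch would be fed to the model in place of the counting step.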

We use the Jupyter notebook pre-processing.ipynb to pre-process the data. You can get a better understanding of this task and the data by taking a look at it. You can also generate the data yourself in whatever way you want by changing a few lines of code in this notebook.

If you want to try a model now, you can download the cached file from above, then go to the folder 'a02_TextCNN' and run

python p7_TextCNN_train.py

It will use the data from the cached files to train the model, and print the loss and F1 score periodically.

Old sample data source: if you need some sample data and word embeddings pre-trained with word2vec, you can find them in closed issues, such as issue 3.

You can also find some sample data in the folder "data". It contains two files: 'sample_single_label.txt', which contains 50k examples with a single label, and 'sample_multiple_label.txt', which contains 20k examples with multiple labels. The input and the label are separated by " label".
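As a rough illustration of the file layout described above, the sketch below splits one line into its input text and its labels. It is not the repository's own parser. The function name `parse_sample_line` is hypothetical, the default separator is simply the string quoted above (the actual files may use a different marker, such as a fastText-style `__label__`), and treating everything after the separator as whitespace-separated labels is an assumption.

```python
def parse_sample_line(line, separator=" label"):
    """Split one sample line into (text, labels).

    separator: marker between input and labels. The default is the string
    quoted in the README text; pass another marker if the files differ.
    Assumption: labels after the marker are separated by whitespace.
    """
    text, found, label_part = line.rstrip("\n").partition(separator)
    if not found:
        raise ValueError("separator %r not found in line" % separator)
    return text.strip(), label_part.split()


def read_samples(path, separator=" label"):
    """Read a whole sample file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [parse_sample_line(line, separator)
                for line in handle if line.strip()]
```

A single-label line yields a one-element label list, so the same function covers both sample files under these assumptions.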

If you want to know more about the data sets for text classification, or the tasks these models can be used for, one option is below:

luhawk803 commented 5 years ago

Could you give more hints? I really want to use your open source project, but the documentation is really hard to understand. @brightmart

yoonjae5 commented 5 years ago

I ran into this problem too, and I don't know how to solve it.

Zhonghui commented 4 years ago

Same problem here. Has anyone solved it?

yoonjae5 commented 4 years ago

I've forgotten how I did it. It should be somewhere in his issues.

Zhonghui commented 4 years ago

I looked at the links in issue 3 and downloaded everything, but I couldn't find it.

qiaobaDu commented 3 years ago

I couldn't find the data in any of the folders. Has anyone found it?