how to get the corpus like 'Financial News 金融新闻'?

Embedding / Chinese-Word-Vectors

100+ Chinese Word Vectors 上百种预训练中文词向量

Apache License 2.0


how to get the corpus like 'Financial News 金融新闻'? #43

Open SeekPoint opened 5 years ago

shenshen-hungry commented 5 years ago

The corpora listed in the Corpus section can be downloaded. The rest were crawled from the Internet with a Scrapy crawler. Because of copyright, releasing them could expose us to legal risk. I'm very sorry about that. But with a simple Scrapy crawler, it is easy to collect a large amount of data in a few days.
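
For anyone who wants to try this, here is a rough sketch of the kind of simple crawler described above. It is not the crawler used for this project: it uses only the Python standard library instead of Scrapy so that it runs without extra installs, and the names `ParagraphExtractor`, `extract_paragraphs`, and `crawl`, the `User-Agent` string, and the one-second delay are all my own assumptions. It also assumes the article text sits in `<p>` tags, which will not hold for every news site.

```python
import time
import urllib.request
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buffer = []

    def _flush(self):
        # Store whatever text has been buffered for the current paragraph.
        text = "".join(self._buffer).strip()
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # copes with pages that never close their <p> tags
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the paragraph texts of one HTML page as a list of strings."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up a trailing paragraph that was never closed
    return parser.paragraphs


def crawl(urls, out_path, delay=1.0):
    """Fetch each URL and append its paragraphs to out_path, one per line."""
    with open(out_path, "a", encoding="utf-8") as out:
        for url in urls:
            request = urllib.request.Request(
                url, headers={"User-Agent": "corpus-sketch/0.1"}  # assumed value
            )
            with urllib.request.urlopen(request, timeout=10) as response:
                charset = response.headers.get_content_charset() or "utf-8"
                html = response.read().decode(charset, errors="replace")
            for paragraph in extract_paragraphs(html):
                out.write(paragraph + "\n")
            time.sleep(delay)  # pause between requests to avoid hammering the site
```

You would call `crawl` with your own list of article URLs and an output file, then run the usual word segmentation on the result before training. A real crawl over several days would also need link discovery, deduplication, and error handling, which is where Scrapy becomes worth using.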