Questions about data - Githubissues

crownpku / Information-Extraction-Chinese

Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT 中文实体识别与关系提取

Hi,

I've read the description of the corpus on your blog, but I still have some questions about it.

(1) It seems that you haven't tried to reduce the noise in the datasets generated by distant supervision. Have you used any prior knowledge to handle these datasets? Or can the two attention mechanisms, over characters and over sentences, reduce the noise?

(2) Does the full training set consist of the train.txt in your GitHub repository (1000 sentences) plus the open-source project Roshanson/TextInfoExp (89183 sentences)?

I hope to get your reply. Thank you in advance!

Questions about data #69