benywon / ChineseBert

This is a Chinese BERT model specifically for question answering.

about train #1

Open helloword12345678 opened 5 years ago

helloword12345678 commented 5 years ago

Thanks for sharing. In the README you say: "Data: 200m chinese internet question answering pairs. Vocab: 52777, jieba CWS enhanced with forward maximum matching." So your training examples are just <question, answer> pairs. That makes the training setup different from SQuAD, where each example is a <passage, question, answer> triple: since you have no passage, this is not a span-extraction problem. If I understand correctly, your training task is a two-sentence similarity problem? Could you give some explanation?
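For illustration only, the contrast between the two formats might look like this (the field names here are hypothetical, not taken from this repository):

```python
# SQuAD-style: the answer is a span located inside a given passage.
squad_example = {
    "passage": "BERT was introduced by researchers at Google in 2018.",
    "question": "Who introduced BERT?",
    "answer_span": (23, 44),  # passage[23:44] == "researchers at Google"
}

# Pair-style: no passage; the task is whether the answer matches the question.
pair_example = {
    "question": "Who introduced BERT?",
    "answer": "BERT was introduced by researchers at Google.",
    "label": 1,  # 1 = matching pair, 0 = non-matching
}
```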

benywon commented 5 years ago

Exactly! Since the main focus of BERT is pre-training, the model is essentially sentence-level. However, you can build your own model on top of BERT to perform task-specific applications such as machine comprehension, question answering, sentence classification, etc.
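For instance, a sentence-pair matching head on top of a pre-trained BERT encoder could be sketched as below. This is a minimal illustration of the general pattern, assuming the Hugging Face transformers library and the public bert-base-chinese checkpoint; it is not this repository's actual model or training code.

```python
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class PairMatcher(nn.Module):
    """Binary question-answer matching on top of a BERT encoder."""

    def __init__(self, encoder_name="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(encoder_name)
        # Classify over the pooled [CLS] representation: match / no match.
        self.classifier = nn.Linear(self.bert.config.hidden_size, 2)

    def forward(self, input_ids, attention_mask, token_type_ids):
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        pooled = outputs.pooler_output  # [CLS] summary vector
        return self.classifier(pooled)  # logits over {no match, match}

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = PairMatcher()

# Encode a <question, answer> pair as two segments (token_type_ids 0/1).
batch = tokenizer("什么是BERT?", "BERT是一种预训练语言模型。",
                  return_tensors="pt", padding=True, truncation=True)
logits = model(**batch)
```

At inference time, candidate answers for a question can then be ranked by the "match" logit, which is how a sentence-level model can still be used for question answering without span extraction.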