shuzi / insuranceQA

A question answering corpus in the insurance domain

InsuranceQA Corpus

This dataset is provided as is and for research purposes only. If you publish anything using this data, please cite our paper: "Applying Deep Learning to Answer Selection: A Study and An Open Task", Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou, ASRU 2015.

Introduction

Format

Corpus Statistics

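| Split | Question | Answer | Question Running Words |
|-------|----------|--------|------------------------|
| Train | 12,889   | 21,325 | 107,889                |
| Valid | 2,000    | 3,354  | 16,931                 |
| Test  | 2,000    | 3,308  | 16,815                 |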

In total, there are 27,413 answers (the answer set size), containing 3,065,492 running words.
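The counts above imply rough average lengths for questions and answers. The following is a minimal sketch of that arithmetic, not part of the released corpus tooling. The names `SPLITS`, `mean_question_length`, and `mean_answer_length` are hypothetical and exist only for this illustration; the numbers are copied from the statistics above.

```python
# Figures copied from the Corpus Statistics section of this README.
# All names below are illustrative, not part of the dataset release.
SPLITS = {
    "Train": {"questions": 12_889, "answers": 21_325, "question_words": 107_889},
    "Valid": {"questions": 2_000, "answers": 3_354, "question_words": 16_931},
    "Test": {"questions": 2_000, "answers": 3_308, "question_words": 16_815},
}
ANSWER_SET_SIZE = 27_413
ANSWER_RUNNING_WORDS = 3_065_492


def mean_question_length(split: str) -> float:
    """Average number of running words per question in the given split."""
    row = SPLITS[split]
    return row["question_words"] / row["questions"]


def mean_answer_length() -> float:
    """Average number of running words per answer over the whole answer set."""
    return ANSWER_RUNNING_WORDS / ANSWER_SET_SIZE


if __name__ == "__main__":
    for name in SPLITS:
        print(f"{name}: {mean_question_length(name):.2f} words per question")
    print(f"Answers: {mean_answer_length():.2f} words per answer")
```

By this estimate, questions average a little over 8 running words in every split, while answers average roughly 112, so answers are more than ten times longer than questions.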