11693-2 / project-team02

Software Method Team Project
Annotators for Question Extraction #11

Open xiliuhk opened 9 years ago

xiliuhk commented 9 years ago

Question

For each user question, analyze the question to determine which type of entity needs to be retrieved, then retrieve a list of named entities ranked by relevance.

QuestionVectorAnnotator

We need this annotator to extract tokens from the query text, since we cannot simply search the database for the whole query sentence.

This annotator processes query text in three steps:

The original query sentence may contain punctuation such as commas, periods, and colons, as well as different forms of the same word, such as "is" and "was". All of these affect the accuracy of the returned results.
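To make the cleanup concrete, here is a minimal Python sketch of that kind of normalization. It is not the annotator's actual code: the function name `normalize_query` and the tiny `BASE_FORMS` table are hypothetical, and a real implementation would likely use a proper stemmer or lemmatizer.

```python
import string

# Hypothetical lookup table mapping inflected forms to a base form;
# a real system would use a proper lemmatizer or stemmer instead.
BASE_FORMS = {"is": "be", "was": "be", "are": "be", "were": "be"}


def normalize_query(query):
    """Strip punctuation, lowercase, and map known word forms to a base form."""
    # Replace every ASCII punctuation character with a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = query.translate(table).lower().split()
    return [BASE_FORMS.get(token, token) for token in tokens]


print(normalize_query("Is Rheumatoid Arthritis more common in men or women?"))
# prints: ['be', 'rheumatoid', 'arthritis', 'more', 'common', 'in', 'men', 'or', 'women']
```

With this, "is" and "was" collapse to the same token, so two phrasings of the same question produce the same search terms.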

This time, we used:

As a next step, we may look for a better solution.

Extracting keywords from a sentence is difficult. However, the provided APIs can help us.
The MeSH service from the API can return the most relevant keywords. For example, for the question "Is Rheumatoid Arthritis more common in men or women?", calling getKeywords() returns "Is Rheumatoid Arthritis more common in men women". It also provides a list of related keywords, which may help improve accuracy in the future.
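For offline testing, the input/output behaviour in that example can be mimicked with a small local stand-in. This is only a sketch: `get_keywords_sketch` and `DROPPED_WORDS` are hypothetical names, it does not call the MeSH service, and the rule it applies (strip punctuation, drop "or") is inferred from the single example above rather than from the service's real logic.

```python
import string

# Assumption: only the conjunction "or" is dropped, because that is all the
# single example in this thread shows; the real service may remove more.
DROPPED_WORDS = {"or"}


def get_keywords_sketch(question):
    """Local stand-in for getKeywords(): strip punctuation, drop listed words."""
    # Remove leading/trailing punctuation from each whitespace-separated word.
    words = [w.strip(string.punctuation) for w in question.split()]
    return " ".join(w for w in words if w and w.lower() not in DROPPED_WORDS)


print(get_keywords_sketch("Is Rheumatoid Arthritis more common in men or women?"))
# prints: Is Rheumatoid Arthritis more common in men women
```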

xiliuhk commented 9 years ago

function: getKeywords()