seshadrs / hw2-team16

Software Engineering Team Project (Team 16). A 'UIMA' based Natural Language Question Answering pipeline for the Biotech domain.

Reverse Mapping of Offsets #34

Closed seshadrs closed 11 years ago

seshadrs commented 11 years ago

The mapping of offsets from raw text back to HTML needs to improve; right now it is a bad heuristic. The low MAP score (~11% at the document level) is most likely caused by this. We need to check every word token of a span from the left and from the right, and find the possible boundaries in the HTML.
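
A minimal sketch of the boundary idea above, in Python for brevity (the project itself is a UIMA pipeline, and this is not its code). The names `normalize` and `find_html_span` are hypothetical. It assumes the raw text was produced by dropping tags and collapsing whitespace, that tags never split a word, and that HTML entities are already decoded. The leftmost word of the span gives candidate start offsets, the rightmost word gives candidate end offsets, and a pair is accepted when the HTML between them reduces to the span.

```python
import re

# Assumption: a tag is anything between '<' and '>'.
TAG = re.compile(r"<[^>]*>")


def normalize(text):
    """Replace tags with spaces and collapse runs of whitespace."""
    return " ".join(TAG.sub(" ", text).split())


def find_html_span(html, raw_span):
    """Return (start, end) offsets in html whose tag-stripped content
    equals raw_span, or None if no candidate pair matches."""
    words = raw_span.split()
    if not words:
        return None
    target = " ".join(words)
    first, last = words[0], words[-1]
    # Candidate left boundaries: every occurrence of the first word.
    starts = [m.start() for m in re.finditer(re.escape(first), html)]
    # Candidate right boundaries: every occurrence of the last word.
    ends = [m.end() for m in re.finditer(re.escape(last), html)]
    # Exhaustive check of every (start, end) pair.
    for s in starts:
        for e in ends:
            if e > s and normalize(html[s:e]) == target:
                return (s, e)
    return None


html = "<p>The <b>BRCA1</b> gene is linked to <i>breast cancer</i>.</p>"
span = find_html_span(html, "BRCA1 gene is linked")
if span is not None:
    print(html[span[0]:span[1]])
```

Checking every pair is quadratic in the number of candidates, so on long documents with frequent words the candidate lists would need to be restricted in some way.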

seshadrs commented 11 years ago

Used an exhaustive recursive search of possible boundaries. It was too slow to use, so pruned the search space with a heuristic approach.