kwhitehall / scored


Feature/mich326/extract datasets from abstracts #30

Open mich326 opened 8 years ago

mich326 commented 8 years ago

A very rough first draft of some dataset extraction code for our abstracts. We'll refine and upgrade the regexes and the entity evaluation after looking at some options and documentation.

Any ideas on which tools or heuristics to use for filtering out personal names and place names would be really helpful. That's a big first chunk to get out of the way before any further extraction.
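To make the filtering question concrete, here is a minimal sketch of one possible heuristic. It is not the code in this PR. It pulls capitalised tokens out of an abstract with a regex, then drops sentence-initial words and anything found in a stoplist of places and personal names. The stoplists, the regex, and the function names (`extract_dataset_candidates`, `starts_sentence`) are all hypothetical and only for illustration; a real pipeline would more likely load a gazetteer or use a named-entity recogniser.

```python
import re

# Hypothetical stoplists for illustration only; a real pipeline would load
# a gazetteer and a name list, or use a named-entity recogniser instead.
PLACE_NAMES = {"Sahel", "Africa", "USA"}
PERSON_NAMES = {"Smith", "Jones"}

# Assumed candidate pattern: a capitalised token, optionally followed by
# hyphenated parts, e.g. "TRMM" or "MERRA-2".
CAPITALISED_RE = re.compile(r"\b[A-Z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*")


def starts_sentence(text, index):
    """Return True if the token at `index` is the first word of a sentence."""
    preceding = text[:index].rstrip()
    return preceding == "" or preceding[-1] in ".!?"


def extract_dataset_candidates(abstract, stoplist=PLACE_NAMES | PERSON_NAMES):
    """Return capitalised tokens that are not sentence-initial or stoplisted."""
    candidates = []
    for match in CAPITALISED_RE.finditer(abstract):
        token = match.group(0)
        # Sentence-initial words are capitalised regardless of what they are.
        if starts_sentence(abstract, match.start()):
            continue
        # Drop known places and personal names.
        if token in stoplist:
            continue
        # Keep first-seen order and skip duplicates.
        if token not in candidates:
            candidates.append(token)
    return candidates


abstract = (
    "We compare TRMM and MERRA-2 rainfall over the Sahel. "
    "Results follow Smith and Jones."
)
print(extract_dataset_candidates(abstract))  # ['TRMM', 'MERRA-2']
```

One known weakness of this sketch: a dataset name that opens a sentence is discarded along with ordinary sentence-initial words, and the stoplists only catch names that are already listed.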