harmonydata / harmony

The Harmony Python library: a research tool for psychologists to harmonise data and questionnaire items. Open source.
https://harmonydata.ac.uk
MIT License

Refactor PDF extraction to not use spaCy #11

Closed by woodthom2 1 week ago

woodthom2 commented 6 months ago

See training data in https://github.com/harmonydata/pdf-questionnaire-extraction

woodthom2 commented 1 month ago

See #39; this has partly been done.

woodthom2 commented 1 week ago

Switched to sklearn-crfsuite.