chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

Refactor datasets #234

Closed: bdewilde closed this 5 years ago

bdewilde commented 5 years ago

Description

Motivation and Context

There is a lot of weirdness and inconsistency within and among the datasets, plus a general dissatisfaction with the data structure used for records.

How Has This Been Tested?

All tests pass! Even the new ones.
