chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

Add basic text data augmentation functionality #268

Closed: bdewilde closed this 5 years ago

bdewilde commented 5 years ago

Description

Motivation and Context

I've been training spaCy TextCategorizer models on datasets that are too small, and data augmentation is a great way to improve model performance when labeled data is scarce.

How Has This Been Tested?

Lots of manual validation and trial and error. Wrote some tests, and they pass (mostly...).

Types of changes

Checklist: