RTIInternational / gobbli

Deep learning with text doesn't have to be scary.
Apache License 2.0

Add helper module for exploratory descriptives #7

Open jasonnance opened 5 years ago

jasonnance commented 5 years ago

Feature

Implement best-effort support for some descriptive stats commonly applied to text data -- keyword/n-gram counts, typical document length, distribution of classes/labels, etc.
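
As a rough illustration of the kind of descriptives meant here, the sketch below uses only the Python standard library. The function names (`ngram_counts`, `length_summary`, `label_distribution`) and the simple regex tokenizer are hypothetical, chosen for this example; they are not part of gobbli's API, and a real helper module would likely want a more careful tokenizer.

```python
import re
from collections import Counter
from statistics import mean, median
from typing import Dict, Hashable, List, Sequence

# Assumption: a naive lowercase word tokenizer is good enough for a first look.
_TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text: str) -> List[str]:
    """Split text into lowercase word-like tokens."""
    return _TOKEN_RE.findall(text.lower())


def ngram_counts(texts: Sequence[str], n: int = 1) -> Counter:
    """Count n-grams (joined with spaces) across all documents."""
    counts: Counter = Counter()
    for text in texts:
        tokens = tokenize(text)
        # Documents shorter than n tokens contribute nothing.
        counts.update(
            " ".join(tokens[i : i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


def length_summary(texts: Sequence[str]) -> Dict[str, float]:
    """Summarize document lengths, measured in tokens."""
    lengths = [len(tokenize(text)) for text in texts]
    if not lengths:
        raise ValueError("texts must contain at least one document")
    return {
        "min": min(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "max": max(lengths),
    }


def label_distribution(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return the proportion of documents carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}
```

Typical usage would be something like `ngram_counts(texts, n=2).most_common(20)` to list the most frequent bigrams, alongside `length_summary(texts)` and `label_distribution(labels)` to spot very short documents or class imbalance.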

Motivation

This would help people who are exploring a new dataset before they decide what to do with it or determine what kind of domain-specific preprocessing may be necessary.

Additional Details