Feature
Implement best-effort support for descriptive statistics commonly computed on text data: keyword and n-gram counts, typical document length, distribution of classes/labels, etc.
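To make the request concrete, here is a rough sketch of the kind of output I have in mind. Everything in it is hypothetical: the function names (`tokenize`, `ngram_counts`, `length_summary`, `label_distribution`), the naive regex tokenizer, and the choice of mean/median/min/max for "typical document length" are assumptions for illustration only, not an existing API or a proposed final design.

```python
import re
from collections import Counter
from statistics import mean, median


def tokenize(text):
    # Naive lowercase word tokenizer (an assumption; a real implementation
    # would likely make the tokenizer configurable).
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(docs, n=1):
    # Count n-grams (as tuples of tokens) across all documents.
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts


def length_summary(docs):
    # Summarize document length measured in tokens.
    lengths = [len(tokenize(doc)) for doc in docs]
    if not lengths:
        return {}
    return {
        "mean": mean(lengths),
        "median": median(lengths),
        "min": min(lengths),
        "max": max(lengths),
    }


def label_distribution(labels):
    # Fraction of examples carrying each class/label.
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    docs = ["the cat sat", "the dog sat down", "a cat"]
    labels = ["pet", "pet", "other"]
    print(ngram_counts(docs, n=2).most_common(3))
    print(length_summary(docs))
    print(label_distribution(labels))
```

"Best-effort" here would mean that each statistic is computed only when the relevant column (text or label) can be identified, and skipped otherwise.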
Motivation
These statistics would help people explore a new dataset before they decide what to do with it or determine what kind of domain-specific preprocessing may be necessary.
Additional Details