more text stats, consistent doc extensions, better packaging
New and Changed
Refactored and extended text statistics functionality (PR #350)
Added functions for computing measures of lexical diversity, such as the classic Type-Token-Ratio and modern Hypergeometric Distribution Diversity
Added functions for counting token-level attributes, including morphological features and parts-of-speech, in a convenient form
Refactored all text stats functions to accept a Doc as their first positional arg, suitable for use as custom doc extensions (see below)
Deprecated the TextStats class, since the other ways of accessing the underlying functionality are now more accessible and convenient, and a third is no longer needed
Standardized functionality for getting/setting/removing doc extensions (PR #352)
Now, custom extensions are accessed by name, and users have more control over the process:
>>> import textacy
>>> from textacy import extract, text_stats
>>> textacy.set_doc_extensions("extract")
>>> textacy.set_doc_extensions("text_stats.readability")
>>> textacy.remove_doc_extensions("extract.matches")
>>> textacy.make_spacy_doc("This is a test.", "en_core_web_sm")._.flesch_reading_ease()
118.17500000000001
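The score in the example above is consistent with the standard Flesch Reading Ease formula. The sketch below is a plain-Python illustration, not textacy's implementation: the function name and its count-based signature are hypothetical, and the counts passed in (4 words, 1 sentence, 4 syllables for "This is a test.") are an assumption about how the sentence is tallied.

```python
def flesch_reading_ease(n_words: int, n_sentences: int, n_syllables: int) -> float:
    """Standard Flesch Reading Ease formula, computed from raw counts.

    Hypothetical helper for illustration only; textacy derives these
    counts from a spaCy Doc rather than taking them as arguments.
    """
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = n_words / n_sentences
    syllables_per_word = n_syllables / n_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Assumed counts for "This is a test.": 4 words, 1 sentence, 4 syllables.
score = flesch_reading_ease(n_words=4, n_sentences=1, n_syllables=4)
print(round(score, 3))
```

Scores above 100 are possible for very short sentences made of one-syllable words, which is why this test sentence lands near 118.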
Moved top-level extensions into spacier.core and extract.bags
Standardized extract and text_stats subpackage extensions to use the new setup, and made them more customizable
Improved package code, tests, and docs
Fixed outdated code and comments in the "Quickstart" guide, then renamed it "Walkthrough" since it wasn't actually quick; added a new and, yes, quick "Quickstart" guide to fill the gap (PR #353)
Added a pytest conftest file to improve maintainability and consistency of unit test suite (PR #353)
Improved quality and consistency of type annotations, everywhere (PR #349)
Note: Bumped Python version support from 3.7–3.9 to 3.8–3.10 in order to take advantage of new typing features in PY3.8 and formally support the current major version (PR #348)
Modernized and streamlined package builds and configuration (PR #347)
Removed deprecated setup.py and switched from setuptools to build for builds
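For readers unfamiliar with the Type-Token Ratio mentioned in the text statistics notes above, here is a minimal plain-Python sketch of the basic measure (unique tokens divided by total tokens). The function name, the whitespace tokenization, and the lowercasing step are illustrative assumptions; textacy's own functions operate on spaCy Doc objects and may count words differently.

```python
from typing import Iterable


def type_token_ratio(tokens: Iterable[str]) -> float:
    """Return the number of unique tokens divided by the total token count.

    Hypothetical helper for illustration; tokens are compared
    case-insensitively, which is an assumption of this sketch.
    """
    words = [tok.lower() for tok in tokens]
    if not words:
        raise ValueError("need at least one token")
    return len(set(words)) / len(words)


# Six tokens, five distinct types ("the" occurs twice).
tokens = "the cat sat on the mat".split()
ttr = type_token_ratio(tokens)
print(round(ttr, 4))
```

The plain ratio tends to fall as texts get longer, which is one motivation for length-robust measures such as the Hypergeometric Distribution Diversity named above.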
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps textacy from 0.11.0 to 0.12.0.
Release notes
Sourced from textacy's releases.
Changelog
Sourced from textacy's changelog.
Commits
- 40cd12f Bump pkg version, 0.11 => 0.12
- 4c5da13 Update changelog for new version
- 3e0644b Make tutorial calls less verbose
- 0e2d27e Add fix for removing un-set doc extensions
- 7e47ae1 Deprecate the TextStats class
- 64c84ac Add utils func for getting module function names
- 6f39292 Merge pull request #353 from chartbeat-labs/update-tests-and-docs
- d0c3797 Fix error in quickstart code
- 74e1797 Add an actual quickstart to docs
- 7c1b458 Rename quickstart => walkthrough