hcss-utils / spacy-phrases

Extract phrases using spaCy.

Bump textacy from 0.11.0 to 0.12.0 #24

Closed by dependabot[bot] 2 years ago

dependabot[bot] commented 2 years ago

Bumps textacy from 0.11.0 to 0.12.0.

Release notes

Sourced from textacy's releases.

more text stats, consistent doc extensions, better packaging

New and Changed

  • Refactored and extended text statistics functionality (PR #350)
    • Added functions for computing measures of lexical diversity, such as the classic Type-Token-Ratio and modern Hypergeometric Distribution Diversity
    • Added functions for counting token-level attributes, including morphological features and parts-of-speech, in a convenient form
    • Refactored all text stats functions to accept a Doc as their first positional arg, suitable for use as custom doc extensions (see below)
    • Deprecated the TextStats class, since the other ways of accessing the underlying functionality were made more accessible and convenient, and there is no longer a need for a third method
  • Standardized functionality for getting/setting/removing doc extensions (PR #352)
    • Now, custom extensions are accessed by name, and users have more control over the process:

      >>> import textacy
      >>> from textacy import extract, text_stats
      >>> textacy.set_doc_extensions("extract")
      >>> textacy.set_doc_extensions("text_stats.readability")
      >>> textacy.remove_doc_extensions("extract.matches")
      >>> textacy.make_spacy_doc("This is a test.", "en_core_web_sm")._.flesch_reading_ease()
      118.17500000000001
      
    • Moved top-level extensions into spacier.core and extract.bags

    • Standardized extract and text_stats subpackage extensions to use the new setup, and made them more customizable

  • Improved package code, tests, and docs
    • Fixed outdated code and comments in the "Quickstart" guide, then renamed it "Walkthrough" since it wasn't actually quick; added a new and, yes, quick "Quickstart" guide to fill the gap (PR #353)
    • Added a pytest conftest file to improve maintainability and consistency of unit test suite (PR #353)
    • Improved quality and consistency of type annotations, everywhere (PR #349)
    • Note: Bumped Python version support from 3.7–3.9 to 3.8–3.10 in order to take advantage of new typing features in Python 3.8 and formally support the current major version (PR #348)
    • Modernized and streamlined package builds and configuration (PR #347)
      • Removed deprecated setup.py and switched from setuptools to build for builds
      • Consolidated tool configuration in pyproject.toml
      • Extended and tidied up dev-oriented Makefile
      • Addressed some CI/CD issues
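
To make two of the statistics named above concrete, here is a minimal, library-free sketch. It does not use textacy and is not its implementation: the helper names, the regex tokenizer, and the hand-counted syllables are assumptions made for illustration. It computes the classic Type-Token-Ratio as distinct types over total tokens, and it applies the standard English Flesch Reading Ease formula to "This is a test." (4 words, 1 sentence, 4 syllables), which gives approximately the 118.175 shown in the example above.

```python
import re


def type_token_ratio(tokens):
    """Classic TTR: number of distinct word types divided by number of tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def flesch_reading_ease(n_words, n_sents, n_syllables):
    """Standard English Flesch Reading Ease formula (hypothetical helper)."""
    return 206.835 - 1.015 * (n_words / n_sents) - 84.6 * (n_syllables / n_words)


# Naive lowercase word tokenization, for illustration only;
# a spaCy pipeline would tokenize differently.
tokens = re.findall(r"[a-z']+", "This is a test. This test is short.".lower())
ttr = type_token_ratio(tokens)  # 5 distinct types over 8 tokens

# Counts for "This is a test." were made by hand: 4 words, 1 sentence, 4 syllables.
score = flesch_reading_ease(n_words=4, n_sents=1, n_syllables=4)

print(ttr, round(score, 3))
```

Because each function takes plain counts or tokens, the arithmetic can be checked by hand before comparing it with what a library reports on a real Doc.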

Fixed

  • Added missing import, args in TextStats docs (PR #331, Issue #334)
  • Fixed normalization in YAKE keyword extraction (PR #332)
  • Fixed text encoding issue when loading ConceptNet data on Windows systems (Issue #345)

Contributors

Thanks to @austinjp, @scarroll32, @MirkoLenz for their help!

Changelog

Sourced from textacy's changelog.

0.12.0 (2021-12-06)

Commits
  • 40cd12f Bump pkg version, 0.11 => 0.12
  • 4c5da13 Update changelog for new version
  • 3e0644b Make tutorial calls less verbose
  • 0e2d27e Add fix for removing un-set doc extensions
  • 7e47ae1 Deprecate the TextStats class
  • 64c84ac Add utils func for getting module function names
  • 6f39292 Merge pull request #353 from chartbeat-labs/update-tests-and-docs
  • d0c3797 Fix error in quickstart code
  • 74e1797 Add an actual quickstart to docs
  • 7c1b458 Rename quickstart => walkthrough
  • Additional commits viewable in compare view


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)