alephdata / ingest-file

Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data.
GNU Affero General Public License v3.0

Bump spacy from 3.5.1 to 3.6.1 #509

Closed by dependabot[bot] 1 year ago

dependabot[bot] commented 1 year ago

Bumps spacy from 3.5.1 to 3.6.1.
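
As a rough illustration of what this bump means in version terms, the sketch below classifies the change from 3.5.1 to 3.6.1 by the first version component that differs. The helper names `parse_version` and `classify_bump` are hypothetical and are not part of Dependabot or spaCy; the sketch assumes plain numeric `X.Y.Z` version strings and does not handle pre-release or local version suffixes.

```python
def parse_version(text):
    """Split a plain 'X.Y.Z' version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def classify_bump(old, new):
    """Name the first component that differs: 'major', 'minor' or 'patch'.

    Returns 'none' when the first three components are identical.
    """
    labels = ("major", "minor", "patch")
    for label, before, after in zip(labels, parse_version(old), parse_version(new)):
        if before != after:
            return label
    return "none"


# The versions named in this PR differ first in the second component.
print(classify_bump("3.5.1", "3.6.1"))  # prints "minor"
```

Under these assumptions the PR is a minor-version bump, which is why the release notes quoted below cover both v3.6.0 and v3.6.1.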

Release notes

Sourced from spacy's releases.

v3.6.1: Support for Pydantic v2, find-function CLI and more

✨ New features and improvements

  • Allow Pydantic v2 using transitional v1 support (#12888).
  • Add find-function CLI for finding locations of registered functions (#12757).
  • Add extra spacy[cuda12x] for cupy-cuda12x (#12890).
  • Extend tests for init config and train CLI (#12173).
  • Switch from distutils to setuptools/sysconfig (#12853).

🔴 Bug fixes

  • #12817: Escape annotated HTML tags in displaCy span renderer.
  • #12857: Display model's full base version string in incompatibility warning.
  • #12882: Update <br> tags in displaCy.

📖 Documentation and examples

  • Various documentation corrections and updates.
  • New additions to spaCy Universe:

👥 Contributors

@adrianeboyd, @afriedman412, @arplusman, @bdura, @connorbrinton, @honnibal, @ines, @it176131, @pmbaumgartner, @rmitsch, @shadeMe, @svlandeg, @thomashacker, @victorialslocum, @x-tabdeveloping

v3.6.0: New span finder component and pipelines for Slovenian

✨ New features and improvements

  • NEW: span_finder pipeline component to identify overlapping, unlabeled spans (#12507).
  • Language updates:
    • Add initial support for Malay (#12602).
    • Update Latin defaults to support noun chunks, update lexical/tokenizer defaults and add example sentences (#12538).
  • Add option to return scores separately keyed by component name with spacy evaluate --per-component, Language.evaluate(per_component=True) and Scorer.score(per_component=True) (#12540).
  • Support custom token/lexeme attribute for vectors (#12625).
  • Support spancat_singlelabel in spacy debug data CLI (#12749).
  • Typing updates for PhraseMatcher and SpanGroup (#12642, #12714).

🔴 Bug fixes

  • #12569: Require that all SpanGroup spans come from the current doc.

📦 Trained pipelines updates

We have added new pipelines for Slovenian that use the trainable lemmatizer and floret vectors.

Package           UPOS   Parser LAS   NER F
sl_core_news_sm   96.9   82.1         62.9
sl_core_news_md   97.6   84.3         73.5

... (truncated)



Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)