mim-solutions / bert_for_longer_texts

BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them to BERT, intermediate results are pooled. The implementation allows fine-tuning.
Other
129 stars 30 forks source link

Bump the pip group across 1 directory with 5 updates #45

Closed dependabot[bot] closed 5 months ago

dependabot[bot] commented 6 months ago

Bumps the pip group with 5 updates in the / directory:

Package From To
transformers 4.36.0 4.38.0
idna 3.4 3.7
pillow 10.2.0 10.3.0
requests 2.31.0 2.32.0
tqdm 4.65.0 4.66.3

Updates transformers from 4.36.0 to 4.38.0

Release notes

Sourced from transformers's releases.

v4.38: Gemma, Depth Anything, Stable LM; Static Cache, HF Quantizer, AQLM

New model additions

💎 Gemma 💎

Gemma is a new opensource Language Model series from Google AI that comes with a 2B and 7B variant. The release comes with the pre-trained and instruction fine-tuned versions and you can use them via AutoModelForCausalLM, GemmaForCausalLM or pipeline interface!

Read more about it in the Gemma release blogpost: https://hf.co/blog/gemma

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b") model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", device_map="auto", torch_dtype=torch.float16)

input_text = "Write me a poem about Machine Learning." input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)

You can use the model with Flash Attention, SDPA, Static cache and quantization API for further optimizations !

  • Flash Attention 2
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")

model = AutoModelForCausalLM.from_pretrained( "google/gemma-2b", device_map="auto", torch_dtype=torch.float16, attn_implementation="flash_attention_2" )

input_text = "Write me a poem about Machine Learning." input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)

  • bitsandbytes-4bit
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")

model = AutoModelForCausalLM.from_pretrained( "google/gemma-2b", device_map="auto", load_in_4bit=True ) </tr></table>

... (truncated)

Commits
  • 08ab54a [ gemma] Adds support for Gemma 💎 (#29167)
  • 2de9314 [Maskformer] safely get backbone config (#29166)
  • 476957b 🚨 Llama: update rope scaling to match static cache changes (#29143)
  • 7a4bec6 Release: 4.38.0
  • ee3af60 Add support for fine-tuning CLIP-like models using contrastive-image-text exa...
  • 0996a10 Revert low cpu mem tie weights (#29135)
  • 15cfe38 [Core tokenization] add_dummy_prefix_space option to help with latest is...
  • efdd436 FIX [PEFT / Trainer ] Handle better peft + quantized compiled models (#29...
  • 5e95dca [cuda kernels] only compile them when initializing (#29133)
  • a7755d2 Generate: unset GenerationConfig parameters do not raise warning (#29119)
  • Additional commits viewable in compare view


Updates idna from 3.4 to 3.7

Release notes

Sourced from idna's releases.

v3.7

What's Changed

  • Fix issue where specially crafted inputs to encode() could take exceptionally long amount of time to process. [CVE-2024-3651]

Thanks to Guido Vranken for reporting the issue.

Full Changelog: https://github.com/kjd/idna/compare/v3.6...v3.7

Changelog

Sourced from idna's changelog.

3.7 (2024-04-11) ++++++++++++++++

  • Fix issue where specially crafted inputs to encode() could take exceptionally long amount of time to process. [CVE-2024-3651]

Thanks to Guido Vranken for reporting the issue.

3.6 (2023-11-25) ++++++++++++++++

  • Fix regression to include tests in source distribution.

3.5 (2023-11-24) ++++++++++++++++

  • Update to Unicode 15.1.0
  • String codec name is now "idna2008" as overriding the system codec "idna" was not working.
  • Fix typing error for codec encoding
  • "setup.cfg" has been added for this release due to some downstream lack of adherence to PEP 517. Should be removed in a future release so please prepare accordingly.
  • Removed reliance on a symlink for the "idna-data" tool to comport with PEP 517 and the Python Packaging User Guide for sdist archives.
  • Added security reporting protocol for project

Thanks Jon Ribbens, Diogo Teles Sant'Anna, Wu Tingfeng for contributions to this release.

Commits
  • 1d365e1 Release v3.7
  • c1b3154 Merge pull request #172 from kjd/optimize-contextj
  • 0394ec7 Merge branch 'master' into optimize-contextj
  • cd58a23 Merge pull request #152 from elliotwutingfeng/dev
  • 5beb28b More efficient resolution of joiner contexts
  • 1b12148 Update ossf/scorecard-action to v2.3.1
  • d516b87 Update Github actions/checkout to v4
  • c095c75 Merge branch 'master' into dev
  • 60a0a4c Fix typo in GitHub Actions workflow key
  • 5918a0e Merge branch 'master' into dev
  • Additional commits viewable in compare view


Updates pillow from 10.2.0 to 10.3.0

Release notes

Sourced from pillow's releases.

10.3.0

https://pillow.readthedocs.io/en/stable/releasenotes/10.3.0.html

Changes

... (truncated)

Changelog

Sourced from pillow's changelog.

10.3.0 (2024-04-01)

  • CVE-2024-28219: Use strncpy to avoid buffer overflow #7928 [radarhere, hugovk]

  • Deprecate eval(), replacing it with lambda_eval() and unsafe_eval() #7927 [radarhere, hugovk]

  • Raise ValueError if seeking to greater than offset-sized integer in TIFF #7883 [radarhere]

  • Add --report argument to __main__.py to omit supported formats #7818 [nulano, radarhere, hugovk]

  • Added RGB to I;16, I;16L, I;16B and I;16N conversion #7918, #7920 [radarhere]

  • Fix editable installation with custom build backend and configuration options #7658 [nulano, radarhere]

  • Fix putdata() for I;16N on big-endian #7209 [Yay295, hugovk, radarhere]

  • Determine MPO size from markers, not EXIF data #7884 [radarhere]

  • Improved conversion from RGB to RGBa, LA and La #7888 [radarhere]

  • Support FITS images with GZIP_1 compression #7894 [radarhere]

  • Use I;16 mode for 9-bit JPEG 2000 images #7900 [scaramallion, radarhere]

  • Raise ValueError if kmeans is negative #7891 [radarhere]

  • Remove TIFF tag OSUBFILETYPE when saving using libtiff #7893 [radarhere]

  • Raise ValueError for negative values when loading P1-P3 PPM images #7882 [radarhere]

  • Added reading of JPEG2000 palettes #7870 [radarhere]

  • Added alpha_quality argument when saving WebP images #7872 [radarhere]

... (truncated)

Commits
  • 5c89d88 10.3.0 version bump
  • 63cbfcf Update CHANGES.rst [ci skip]
  • 2776126 Merge pull request #7928 from python-pillow/lcms
  • aeb51cb Merge branch 'main' into lcms
  • 5beb0b6 Update CHANGES.rst [ci skip]
  • cac6ffa Merge pull request #7927 from python-pillow/imagemath
  • f5eeeac Name as 'options' in lambda_eval and unsafe_eval, but '_dict' in deprecated eval
  • facf3af Added release notes
  • 2a93aba Use strncpy to avoid buffer overflow
  • a670597 Update CHANGES.rst [ci skip]
  • Additional commits viewable in compare view


Updates requests from 2.31.0 to 2.32.0

Release notes

Sourced from requests's releases.

v2.32.0

2.32.0 (2024-05-20)

🐍 PYCON US 2024 EDITION 🐍

Security

Improvements

  • verify=True now reuses a global SSLContext which should improve request time variance between first and subsequent requests. It should also minimize certificate load time on Windows systems when using a Python version built with OpenSSL 3.x. (#6667)
  • Requests now supports optional use of character detection (chardet or charset_normalizer) when repackaged or vendored. This enables pip and other projects to minimize their vendoring surface area. The Response.text() and apparent_encoding APIs will default to utf-8 if neither library is present. (#6702)

Bugfixes

  • Fixed bug in length detection where emoji length was incorrectly calculated in the request content-length. (#6589)
  • Fixed deserialization bug in JSONDecodeError. (#6629)
  • Fixed bug where an extra leading / (path separator) could lead urllib3 to unnecessarily reparse the request URI. (#6644)

Deprecations

  • Requests has officially added support for CPython 3.12 (#6503)
  • Requests has officially added support for PyPy 3.9 and 3.10 (#6641)
  • Requests has officially dropped support for CPython 3.7 (#6642)
  • Requests has officially dropped support for PyPy 3.7 and 3.8 (#6641)

Documentation

  • Various typo fixes and doc improvements.

Packaging

  • Requests has started adopting some modern packaging practices. The source files for the projects (formerly requests) is now located in src/requests in the Requests sdist. (#6506)
  • Starting in Requests 2.33.0, Requests will migrate to a PEP 517 build system using hatchling. This should not impact the average user, but extremely old versions of packaging utilities may have issues with the new packaging format.

New Contributors

... (truncated)

Changelog

Sourced from requests's changelog.

2.32.0 (2024-05-20)

Security

Improvements

  • verify=True now reuses a global SSLContext which should improve request time variance between first and subsequent requests. It should also minimize certificate load time on Windows systems when using a Python version built with OpenSSL 3.x. (#6667)
  • Requests now supports optional use of character detection (chardet or charset_normalizer) when repackaged or vendored. This enables pip and other projects to minimize their vendoring surface area. The Response.text() and apparent_encoding APIs will default to utf-8 if neither library is present. (#6702)

Bugfixes

  • Fixed bug in length detection where emoji length was incorrectly calculated in the request content-length. (#6589)
  • Fixed deserialization bug in JSONDecodeError. (#6629)
  • Fixed bug where an extra leading / (path separator) could lead urllib3 to unnecessarily reparse the request URI. (#6644)

Deprecations

  • Requests has officially added support for CPython 3.12 (#6503)
  • Requests has officially added support for PyPy 3.9 and 3.10 (#6641)
  • Requests has officially dropped support for CPython 3.7 (#6642)
  • Requests has officially dropped support for PyPy 3.7 and 3.8 (#6641)

Documentation

  • Various typo fixes and doc improvements.

Packaging

  • Requests has started adopting some modern packaging practices. The source files for the projects (formerly requests) is now located in src/requests in the Requests sdist. (#6506)
  • Starting in Requests 2.33.0, Requests will migrate to a PEP 517 build system using hatchling. This should not impact the average user, but extremely old versions of packaging utilities may have issues with the new packaging format.
Commits
  • d6ebc4a v2.32.0
  • 9a40d12 Avoid reloading root certificates to improve concurrent performance (#6667)
  • 0c030f7 Merge pull request #6702 from nateprewitt/no_char_detection
  • 555b870 Allow character detection dependencies to be optional in post-packaging steps
  • d6dded3 Merge pull request #6700 from franekmagiera/update-redirect-to-invalid-uri-test
  • bf24b7d Use an invalid URI that will not cause httpbin to throw 500
  • 2d5f547 Pin 3.8 and 3.9 runners back to macos-13 (#6688)
  • f1bb07d Merge pull request #6687 from psf/dependabot/github_actions/github/codeql-act...
  • 60047ad Bump github/codeql-action from 3.24.0 to 3.25.0
  • 31ebb81 Merge pull request #6682 from frenzymadness/pytest8
  • Additional commits viewable in compare view


Updates tqdm from 4.65.0 to 4.66.3

Release notes

Sourced from tqdm's releases.

tqdm v4.66.3 stable

  • cli: eval safety (fixes CVE-2024-34062, GHSA-g7vv-2v7x-gj9p)

tqdm v4.66.2 stable

  • pandas: add DataFrame.progress_map (#1549)
  • notebook: fix HTML padding (#1506)
  • keras: fix resuming training when verbose>=2 (#1508)
  • fix format_num negative fractions missing leading zero (#1548)
  • fix Python 3.12 DeprecationWarning on import (#1519)
  • linting: use f-strings (#1549)
  • update tests (#1549)
  • CI: bump actions (#1549)

tqdm v4.66.1 stable

  • fix utils.envwrap types (#1493 <- #1491, #1320 <- #966, #1319)
    • e.g. cloudwatch & kubernetes workaround: export TQDM_POSITION=-1
  • drop mentions of unsupported Python versions

tqdm v4.66.0 stable

  • environment variables to override defaults (TQDM_*) (#1491 <- #1061, #950 <- #614, #1318, #619, #612, #370)
    • e.g. in CI jobs, export TQDM_MININTERVAL=5 to avoid log spam
    • add tests & docs for tqdm.utils.envwrap
  • fix & update CLI completion
  • fix & update API docs
  • minor code tidy: replace os.path => pathlib.Path
  • fix docs image hosting
  • release with CI bot account again (cli/cli#6680)

tqdm v4.65.2 stable

  • exclude examples from distributed wheel (#1492)

tqdm v4.65.1 stable

  • migrate setup.{cfg,py} => pyproject.toml (#1490)
    • fix asv benchmarks
    • update docs
  • fix snap build (#1490)
  • fix & update tests (#1490)
    • fix flaky notebook tests
    • bump pre-commit
    • bump workflow actions
Commits


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself) - `@dependabot ignore minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself) - `@dependabot ignore ` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself) - `@dependabot unignore ` will remove all of the ignore conditions of the specified dependency - `@dependabot unignore ` will remove the ignore condition of the specified dependency and ignore conditions You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/mim-solutions/bert_for_longer_texts/network/alerts).
dependabot[bot] commented 5 months ago

Superseded by #46.