This PR contains the following updates:

| Package | Change |
| --- | --- |
| [llama-cpp-python](https://redirect.github.com/abetlen/llama-cpp-python) | `==0.2.90` -> `==0.3.0` |

---

### Release Notes

abetlen/llama-cpp-python (llama-cpp-python)
### [`v0.3.0`](https://redirect.github.com/abetlen/llama-cpp-python/blob/HEAD/CHANGELOG.md#030)
[Compare Source](https://redirect.github.com/abetlen/llama-cpp-python/compare/v0.2.90...v0.3.0)
- feat: Update llama.cpp to [ggerganov/llama.cpp@`ea9c32b`](https://redirect.github.com/ggerganov/llama.cpp/commit/ea9c32be71b91b42ecc538bd902e93cbb5fb36cb)
- feat: Enable detokenizing special tokens with special=True by [@benniekiss](https://redirect.github.com/benniekiss) in [#1596](https://redirect.github.com/abetlen/llama-cpp-python/issues/1596)
- feat(ci): Speed up CI workflows using uv, add support for CUDA 12.5 wheels by [@Smartappli](https://redirect.github.com/Smartappli) in [`e529940`](https://redirect.github.com/abetlen/llama-cpp-python/commit/e529940f45d42ed8aa31334123b8d66bc67b0e78)
- feat: Add loading sharded GGUF files from HuggingFace with Llama.from_pretrained(additional_files=\[...]) by [@Gnurro](https://redirect.github.com/Gnurro) in [`84c0920`](https://redirect.github.com/abetlen/llama-cpp-python/commit/84c092063e8f222758dd3d60bdb2d1d342ac292e)
- feat: Add option to configure n_ubatch by [@abetlen](https://redirect.github.com/abetlen) in [`6c44a3f`](https://redirect.github.com/abetlen/llama-cpp-python/commit/6c44a3f36b089239cb6396bb408116aad262c702)
- feat: Update sampling API for llama.cpp. Sampling now uses sampler chain by [@abetlen](https://redirect.github.com/abetlen) in [`f8fcb3e`](https://redirect.github.com/abetlen/llama-cpp-python/commit/f8fcb3ea3424bcfba3a5437626a994771a02324b)
- fix: Don't store scores internally unless logits_all=True. Reduces memory requirements for large context by [@abetlen](https://redirect.github.com/abetlen) in [`29afcfd`](https://redirect.github.com/abetlen/llama-cpp-python/commit/29afcfdff5e75d7df4c13bad0122c98661d251ab)
- fix: Fix memory allocation of ndarray by [@xu-song](https://redirect.github.com/xu-song) in [#1704](https://redirect.github.com/abetlen/llama-cpp-python/issues/1704)
- fix: Use system message in the original Qwen chat format by [@abetlen](https://redirect.github.com/abetlen) in [`98eb092`](https://redirect.github.com/abetlen/llama-cpp-python/commit/98eb092d3c6e7c142c4ba2faaca6c091718abbb3)
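Two of the entries above touch the loading API: `Llama.from_pretrained(additional_files=[...])` for sharded GGUF files, and the new configurable `n_ubatch` option. A minimal sketch of how they might be used together, assuming llama-cpp-python >= 0.3.0; the repo id and shard file names are hypothetical, and the library is only touched inside the guarded block:

```python
# Sketch of the sharded-GGUF loading added in v0.3.0. The repo id and shard
# names below are hypothetical; network/model access only happens when the
# library is actually installed.
try:
    from llama_cpp import Llama
except ImportError:  # library not installed: keep this a dry sketch
    Llama = None

def shard_names(stem: str, n_shards: int) -> list[str]:
    """Build file names in the common '-00001-of-0000N.gguf' shard convention."""
    return [f"{stem}-{i:05d}-of-{n_shards:05d}.gguf" for i in range(1, n_shards + 1)]

if Llama is not None:
    files = shard_names("some-model-q4_k_m", 3)  # hypothetical shard stem
    llm = Llama.from_pretrained(
        repo_id="some-org/some-sharded-gguf",  # hypothetical HF repo
        filename=files[0],                     # first shard
        additional_files=files[1:],            # remaining shards, new in 0.3.0
        n_ubatch=512,                          # micro-batch size, configurable per this release
    )
```

Note also that, per the fix above, scores are only stored internally when `logits_all=True`, so large-context loads like this should need less memory by default.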
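Because the sampling API was reworked to use a sampler chain, 0.2.90 -> 0.3.0 is a minor-version bump that can break code depending on the old sampling internals. A minimal sketch (plain-Python tuple comparison, sufficient for simple `X.Y.Z` pins like this one, not a full PEP 440 parser) for flagging such bumps before merging:

```python
# Compare pinned versions as integer tuples; adequate for plain X.Y.Z pins.
def parse_version(pin: str) -> tuple[int, ...]:
    return tuple(int(part) for part in pin.split("."))

old, new = parse_version("0.2.90"), parse_version("0.3.0")

assert new > old                 # 0.3.0 sorts after 0.2.90
minor_bump = new[:2] != old[:2]  # 0.2 -> 0.3: review the changelog before merging
print(minor_bump)
```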
---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.
- [ ] If you want to rebase/retry this PR, check this box
This PR was generated by Mend Renovate. View the repository job log.