abetlen/llama-cpp-python (llama_cpp_python)
### [`v0.3.2`](https://redirect.github.com/abetlen/llama-cpp-python/blob/HEAD/CHANGELOG.md#032)
[Compare Source](https://redirect.github.com/abetlen/llama-cpp-python/compare/v0.3.1...v0.3.2)
- feat: Update llama.cpp to [ggerganov/llama.cpp@`74d73dc`](https://redirect.github.com/ggerganov/llama.cpp/commit/74d73dc85cc2057446bf63cc37ff649ae7cebd80)
### [`v0.3.1`](https://redirect.github.com/abetlen/llama-cpp-python/blob/HEAD/CHANGELOG.md#031)
[Compare Source](https://redirect.github.com/abetlen/llama-cpp-python/compare/v0.3.0...v0.3.1)
- feat: Update llama.cpp to [ggerganov/llama.cpp@`c919d5d`](https://redirect.github.com/ggerganov/llama.cpp/commit/c919d5db39c8a7fcb64737f008e4b105ee0acd20)
- feat: Expose libggml in internal APIs by [@abetlen](https://redirect.github.com/abetlen) in [#1761](https://redirect.github.com/abetlen/llama-cpp-python/issues/1761)
- fix: Fix speculative decoding by [@abetlen](https://redirect.github.com/abetlen) in [`9992c50`](https://redirect.github.com/abetlen/llama-cpp-python/commit/9992c5084a3df2f533e265d10f81d4269b97a1e6) and [`e975dab`](https://redirect.github.com/abetlen/llama-cpp-python/commit/e975dabf74b3ad85689c9a07719cbb181313139b)
- misc: Rename all_text to remaining_text by [@xu-song](https://redirect.github.com/xu-song) in [#1658](https://redirect.github.com/abetlen/llama-cpp-python/issues/1658)
### [`v0.3.0`](https://redirect.github.com/abetlen/llama-cpp-python/blob/HEAD/CHANGELOG.md#030)
[Compare Source](https://redirect.github.com/abetlen/llama-cpp-python/compare/v0.2.90...v0.3.0)
- feat: Update llama.cpp to [ggerganov/llama.cpp@`ea9c32b`](https://redirect.github.com/ggerganov/llama.cpp/commit/ea9c32be71b91b42ecc538bd902e93cbb5fb36cb)
- feat: Enable detokenizing special tokens with special=True by [@benniekiss](https://redirect.github.com/benniekiss) in [#1596](https://redirect.github.com/abetlen/llama-cpp-python/issues/1596)
- feat(ci): Speed up CI workflows using uv, add support for CUDA 12.5 wheels by [@Smartappli](https://redirect.github.com/Smartappli) in [`e529940`](https://redirect.github.com/abetlen/llama-cpp-python/commit/e529940f45d42ed8aa31334123b8d66bc67b0e78)
- feat: Add loading sharded GGUF files from HuggingFace with Llama.from_pretrained(additional_files=\[...]) by [@Gnurro](https://redirect.github.com/Gnurro) in [`84c0920`](https://redirect.github.com/abetlen/llama-cpp-python/commit/84c092063e8f222758dd3d60bdb2d1d342ac292e)
- feat: Add option to configure n_ubatch by [@abetlen](https://redirect.github.com/abetlen) in [`6c44a3f`](https://redirect.github.com/abetlen/llama-cpp-python/commit/6c44a3f36b089239cb6396bb408116aad262c702)
- feat: Update sampling API for llama.cpp. Sampling now uses sampler chain by [@abetlen](https://redirect.github.com/abetlen) in [`f8fcb3e`](https://redirect.github.com/abetlen/llama-cpp-python/commit/f8fcb3ea3424bcfba3a5437626a994771a02324b)
- fix: Don't store scores internally unless logits_all=True. Reduces memory requirements for large context by [@abetlen](https://redirect.github.com/abetlen) in [`29afcfd`](https://redirect.github.com/abetlen/llama-cpp-python/commit/29afcfdff5e75d7df4c13bad0122c98661d251ab)
- fix: Fix memory allocation of ndarray by [@xu-song](https://redirect.github.com/xu-song) in [#1704](https://redirect.github.com/abetlen/llama-cpp-python/issues/1704)
- fix: Use system message in og qwen format by [@abetlen](https://redirect.github.com/abetlen) in [`98eb092`](https://redirect.github.com/abetlen/llama-cpp-python/commit/98eb092d3c6e7c142c4ba2faaca6c091718abbb3)
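The v0.3.0 entries above mention two new `Llama` options: loading sharded GGUF files via `Llama.from_pretrained(additional_files=[...])` and configuring `n_ubatch`. A minimal sketch of how these might be combined is below; the repo id and shard filenames are illustrative placeholders, not a real model, and the import is deferred so the sketch reads without `llama-cpp-python` installed.

```python
def load_sharded_model():
    """Load a multi-shard GGUF model from HuggingFace (hypothetical repo/files)."""
    # Imported lazily: requires `pip install llama-cpp-python` >= 0.3.0.
    from llama_cpp import Llama

    return Llama.from_pretrained(
        repo_id="example-org/example-model-GGUF",        # placeholder repo id
        filename="model-00001-of-00002.gguf",            # first shard
        additional_files=["model-00002-of-00002.gguf"],  # remaining shards (new in v0.3.0)
        n_ubatch=256,       # physical batch size, configurable since v0.3.0
        logits_all=False,   # per v0.3.0: scores are no longer stored unless True
    )
```

Note that with `logits_all=False` (the default) per-token scores are not retained, which is what reduces memory use at large context sizes.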
### Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Enabled.

♻ Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.
- [ ] If you want to rebase/retry this PR, check this box
This PR contains the following updates:

| Package | Change |
|---|---|
| llama_cpp_python | `==0.2.90` -> `==0.3.2` |
This PR was generated by Mend Renovate. View the repository job log.