-
-
### Feature request
https://huggingface.co/docs/transformers/main/en/gguf
The documentation above shows that a GGUF model can be loaded, and provides this simple example:
from transformers import Aut…
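A minimal sketch of the loading pattern described in the linked docs. The repo id and file name below are illustrative assumptions, not values from the original report; the calls are wrapped in a function so nothing is downloaded at import time.

```python
def load_gguf_model(model_id: str, gguf_file: str):
    # Deferred import so defining this helper does not require transformers.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Passing gguf_file selects the quantized checkpoint inside the repo;
    # transformers dequantizes it into a regular torch model.
    tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
    model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)
    return tokenizer, model

# Illustrative (assumed) repo and file names -- swap in your own.
MODEL_ID = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
GGUF_FILE = "tinyllama-1.1b-chat-v1.0.Q4_0.gguf"
```

Calling `load_gguf_model(MODEL_ID, GGUF_FILE)` would download and dequantize the checkpoint, so it is left to the reader to invoke.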
-
I am trying to get TinyLlama working on the GPU with:
```bash
./TinyLlama-1.1B-Chat-v1.0.F32.llamafile -ngl 9999
```
But it seems it is not possible to allocate 66.50 MB of memory on my card, even if I j…
-
We've recently introduced the `--hf-repo` and `--hf-file` helper args to `common` in https://github.com/ggerganov/llama.cpp/pull/6234:
```
ref #4735 #5501 #6085 #6098
Sample usage:
./bin/mai…
-
f32 will not start. I just converted the same model as q40 and it seems to work fine. I also tried with `./dllama inference`.
f32:
```sh
sudo nice -n -20 ./dllama inference --model models/TinyLla…
-
### 🐛 Describe the bug
A runtime error was observed while running llama2 on an aarch64 Grace super server, with oneDNN built on the native system from https://github.com/oneapi-src/oneDNN/blob/main/README.md and A…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a…
-
-
### What is the issue?
Background:
Kubernetes 1.31 introduced a new feature: [Read-Only Volumes Based on OCI Artifacts](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/).…
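For reference, a pod spec using the new image volume source looks roughly like this (a sketch based on the alpha API in the linked announcement; the image references are illustrative, and the `ImageVolume` feature gate must be enabled):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: image-volume-example
spec:
  containers:
  - name: shell
    image: debian
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: artifact
      mountPath: /volume   # OCI artifact contents appear here, read-only
  volumes:
  - name: artifact
    image:
      reference: quay.io/example/artifact:v1  # illustrative OCI reference
      pullPolicy: IfNotPresent
```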
-
DataChunkRecipe is not working when used in litgpt's TinyLlama pretraining example.
Error: `AttributeError: 'SlimPajamaDataRecipe' object has no attribute 'is_generator'`
The type of SlimPajamaDataReci…