# uname -a
Linux localai-ix-chart-f8bbbb7c7-x6xx9 6.1.42-production+truenas #2 SMP PREEMPT_DYNAMIC Mon Aug 14 23:21:26 UTC 2023 x86_64 GNU/Linux
# nvidia-smi
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P40 Off | 00000000:23:00.0 Off | Off |
| N/A 24C P8 10W / 250W | 0MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 Tesla P40 Off | 00000000:24:00.0 Off | Off |
| N/A 23C P8 9W / 250W | 0MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
# cat /proc/cpuinfo |grep "model name" | nl
1 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
2 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
3 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
4 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
5 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
6 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
7 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
8 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
9 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
10 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
11 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
12 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
13 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
14 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
15 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
16 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
17 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
18 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
19 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
20 model name : Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
# cat /proc/meminfo | grep Mem
MemTotal: 32701568 kB
MemFree: 18305148 kB
MemAvailable: 18767368 kB
Describe the bug
LocalAI crashes with a GGML assertion failure when running inference with the model below.
To Reproduce
Use the wizardlm-33b-v1.0-uncensored.ggmlv3.q4_K_S.bin model and send a chat completion request:
curl $LOCALAI/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "thebloke__wizardlm-33b-v1.0-uncensored-ggml__wizardlm-33b-v1.0-uncensored.ggmlv3.q4_k_s.bin",
"messages": [{"role": "user", "content": "Give me a HTTP REST server made in rust that uses sqlite."}],
"temperature": 0.9
}' | jq
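If the backend aborts on an assertion, the connection may close before any JSON is sent, and piping straight into jq can hide that. Below is a minimal sketch of the same request that also prints the HTTP status and curl's exit code. It assumes `LOCALAI` holds the base URL as in the command above; `build_payload` is a hypothetical helper name and does no JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: builds the chat request body.
# Assumes model and prompt contain no characters that need JSON escaping.
build_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "temperature": 0.9}' "$1" "$2"
}

MODEL="thebloke__wizardlm-33b-v1.0-uncensored-ggml__wizardlm-33b-v1.0-uncensored.ggmlv3.q4_k_s.bin"
PROMPT="Give me a HTTP REST server made in rust that uses sqlite."

# Only send the request when LOCALAI is set.
if [ -n "${LOCALAI:-}" ]; then
  status=0
  build_payload "$MODEL" "$PROMPT" |
    curl -sS "$LOCALAI/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -d @- \
      -w '\nHTTP status: %{http_code}\n' || status=$?
  # A non-zero exit code here means curl itself failed (no complete HTTP reply).
  echo "curl exit code: $status"
fi
```

A non-zero curl exit code (for example 52, "empty reply from server") together with a missing status line would suggest the server process died mid-request and did not return an error response.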
Expected behavior
The request completes and returns a chat completion instead of crashing.
Logs