-
**LocalAI version:**
latest Docker image
**Environment, CPU architecture, OS, and Version:**
Ryzen 9 3900X -> 12 cores / 24 threads
Windows 10 -> WSL (5.15.90.1-microsoft-standard-WSL2) docke…
-
Hi, I'm running local-ai in Kubernetes and downloaded the model ggml-gpt4all-j in the same way as explained [here](https://github.com/go-skynet/LocalAI#run-localai-in-kubernetes), but got this error:
`…
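For reference, a minimal sketch of the kind of model fetch involved; the download URL and target path are assumptions on my side, not values taken from the README or from this error:
```python
# Hedged sketch: download ggml-gpt4all-j into LocalAI's models directory.
# The URL and path are illustrative assumptions, not values from this issue.
import pathlib
import urllib.request

models_dir = pathlib.Path("models")
models_dir.mkdir(exist_ok=True)
url = "https://gpt4all.io/models/ggml-gpt4all-j.bin"
urllib.request.urlretrieve(url, models_dir / "ggml-gpt4all-j")
```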
-
From here: https://localai.io/models/#useful-links-and-resources
> Keep in mind models compatible with LocalAI must be quantized in the `ggml` format.
Is the GGUF extension supported by LocalAI?…
-
nanoGPT is great, but it would be nice to experiment with different variants from Hugging Face.
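For context, a minimal sketch of what experimenting with such variants could look like via the `transformers` library; the checkpoint name is just an example, not something nanoGPT itself provides:
```python
# Hedged sketch: load a GPT-2 variant from the Hugging Face hub and sample
# from it. "gpt2-medium" is an illustrative choice of checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2-medium"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```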
-
I'm trying to convert the weights as per the example but am running into an issue.
After running
```
mkdir huggingface_models \
  && python tools/convert_to_hf_gptneox.py \
  --ckpt-path model_ckpts/GPT-Neo-…
```
-
I am working on integrating GPT-NeoX and Pythia support into GPTQ-for-LLaMa, aiming to add 4-bit GPTQ quantization and inference capabilities. This would enable a NeoX-20B to run on a single RTX 3090, o…
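To illustrate the intended workflow, a minimal 4-bit GPTQ sketch; it uses the AutoGPTQ library rather than GPTQ-for-LLaMa itself, and the checkpoint name and calibration text are placeholders:
```python
# Hedged sketch of 4-bit GPTQ quantization via AutoGPTQ (not GPTQ-for-LLaMa).
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "EleutherAI/pythia-6.9b"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)

# GPTQ reconstructs weights layer by layer against calibration samples.
examples = [tokenizer("GPTQ calibrates on a handful of sample sequences.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)
model.save_quantized("pythia-6.9b-4bit-gptq")
```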
-
Error log:
```
/mnt/tet/OpenChatKit-main/training
--model-name /mnt/tet/OpenChatKit-main/training/../pretrained/Pythia-6.9B-deduped/EleutherAI_pythia-6.9b-deduped/ --tokenizer-name /mnt/tet/OpenChatKit-…
```
-
Comparing with the reference self-attention implementation from the `flash_attn` module, I find that flash attention gives significantly different results:
```python
import torch
from flash_attn.…
```
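For what it's worth, this is the shape of comparison I'd expect, sketched under the assumption of flash-attn 2.x's `flash_attn_func` and a plain fp32 PyTorch baseline (shapes and the tolerance remark are illustrative):
```python
# Hedged sketch: compare flash_attn_func against a naive softmax-attention
# reference computed in fp32. Assumes flash-attn 2.x and a CUDA device.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 128, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

out_flash = flash_attn_func(q, k, v)  # (batch, seqlen, nheads, headdim)

# Reference attention in fp32: (b, s, h, d) -> (b, h, s, d) for the matmuls.
qf, kf, vf = (t.float().transpose(1, 2) for t in (q, k, v))
scores = qf @ kf.transpose(-2, -1) / headdim**0.5
ref = (scores.softmax(dim=-1) @ vf).transpose(1, 2).half()

print((out_flash - ref).abs().max())  # fp16 kernels typically differ by ~1e-3
```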
-
On 2023-07-26, I'm running:
```
$ sysctl -n machdep.cpu.brand_string
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
$ system_profiler SPHardwareDataType | grep "Model Identifier"
Model Identifier: Mac…
```
-
**Is your feature request related to a problem? Please describe.**
I trained a model with `"model-parallel-size": 2` and am trying to convert it to a Hugging Face model. I refer to `tools/convert_to_hf.py` fo…
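For reference, the core of such a conversion is concatenating the model-parallel shards along the correct dimension; a minimal sketch, with checkpoint filenames and weight-key patterns as illustrative assumptions rather than the actual gpt-neox layout:
```python
# Hedged sketch: merge two model-parallel shards. Column-parallel weights are
# concatenated along dim 0, row-parallel along dim 1, replicated tensors kept
# as-is. Filenames and key patterns are illustrative assumptions.
import torch

shard0 = torch.load("mp_rank_00_model_states.pt", map_location="cpu")["module"]
shard1 = torch.load("mp_rank_01_model_states.pt", map_location="cpu")["module"]

merged = {}
for key, w0 in shard0.items():
    w1 = shard1[key]
    if "query_key_value.weight" in key or "dense_h_to_4h.weight" in key:
        merged[key] = torch.cat([w0, w1], dim=0)   # column-parallel
    elif "attention.dense.weight" in key or "dense_4h_to_h.weight" in key:
        merged[key] = torch.cat([w0, w1], dim=1)   # row-parallel
    else:
        merged[key] = w0                           # replicated (e.g. layernorm)
```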