-
I used the macOS one-click installer to install and run h2oGPT. The web GUI at port 7860 works and I got the interface.
1. First, when trying to load TheBloke/Mistral-7B-Instruct-v0.1-GGUF, I get the error "File "transformers/tokeniza…
-
The ability to select between meta-llama/Llama-2-70b-chat-hf and OpenAssistant/oasst-sft-6-llama-30b would be nice.
-
I'm not sure how easy it is to add models, but this one is proving to be the best so far, and it is available on HuggingChat.
Model: [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mix…
-
**LocalAI version:**
I am on commit: **574fa67bdcafd618859fcda4d239f10f326182a6**
**Environment, CPU architecture, OS, and Version:**
I am on Windows and using WSL2 with Ubuntu 22.04:
…
-
After running:
> docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data ghcr.io/huggingface/text-generation-inference:0.9 --model-id google/flan-t5-small --num-shard 1
I receive:
> Run…
-
On Hugging Face there are many files called ggml-model-f16.bin or similar. Once downloaded, the user can rename them, but the information about their origin gets lost. Updating the file becomes difficult whe…
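One way to keep the origin traceable is to store the weights under a path derived from the repo id and record the download URL in a sidecar file instead of renaming the bare `ggml-model-f16.bin`. A minimal sketch (the repo id and file name here are illustrative, not from the issue):

```shell
# Illustrative repo id and file name; substitute the actual model.
repo="ggerganov/whisper.cpp"
file="ggml-model-f16.bin"

# Keep the Hugging Face repo id in the local directory layout.
mkdir -p "models/${repo}"

# Record the origin next to the weights so later updates stay traceable.
echo "https://huggingface.co/${repo}/resolve/main/${file}" \
  > "models/${repo}/${file}.source"

cat "models/${repo}/${file}.source"
```

With this layout, re-downloading or updating the file is a matter of re-reading the `.source` sidecar rather than guessing where a renamed file came from.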
-
requires the image: `huggingface_text-generation-inference_1.1.0.sqsh` (see https://github.com/huggingface/text-generation-inference/releases/tag/v1.1.0)
**Note that all of the commands require to…
-
I am trying to run CodeLlama with the following setup:
Model size: 34B
GPUs: 2x A6000 (sm_86)
I'd like to run the model tensor-parallel across the two GPUs. Correct me if I'm wrong, but the …
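For reference, text-generation-inference (used elsewhere in this thread) shards a model across GPUs via `--num-shard`. A hedged sketch of such a launch; the image tag, model id, and the NCCL workaround are assumptions, not a verified setup for this hardware:

```shell
# Sketch: shard the 34B model across both A6000s with --num-shard 2.
# Model id and image tag are illustrative.
# If sharding hangs, disabling NCCL peer-to-peer (NCCL_P2P_DISABLE=1)
# is a common workaround to try; whether it is needed on sm_86 is
# an assumption here.
docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data \
  -e NCCL_P2P_DISABLE=1 \
  ghcr.io/huggingface/text-generation-inference:1.1.0 \
  --model-id codellama/CodeLlama-34b-hf \
  --num-shard 2
```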
-
- [ ] Create philosophical shorts for why LLMs may actually "understand"
- [ ] Create a weekly target
- [ ] Reflect on how I would trickle from year to daily vision
- [ ] Create gigs on fastwork
- [ ] …
-
Background of this PoC:
1. [GGML](https://github.com/ggerganov/ggml) is a very compact, highly optimized pure C/C++ machine learning library. GGML is also the solid cornerstone of the amazing [whi…