-
### Summary
- Provide k-quant models
- Maintain existing gguf models
- Embedding models
  - [x] [second-state/Nomic-embed-text-v1.5-Embedding-GGUF](https://huggingface.co/second-state/Nomic-…
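For the embedding item in the list above, a minimal, hedged sketch of consuming such a GGUF file locally with llama-cpp-python; the filename is a hypothetical local download of the linked second-state repo, not something specified in the summary.

```python
# Minimal sketch: running a GGUF embedding model with llama-cpp-python.
# The model_path is an assumed local download; any GGUF embedding file works.
from llama_cpp import Llama

embedder = Llama(
    model_path="nomic-embed-text-v1.5.Q5_K_M.gguf",  # hypothetical local file
    embedding=True,  # embedding mode instead of text generation
    n_ctx=2048,
)
result = embedder.create_embedding("GGUF makes local embedding models easy to ship.")
print(len(result["data"][0]["embedding"]))  # embedding dimensionality
```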
-
Our team recently released a fantastic end-side (on-device) model called MiniCPM-2B.
Experimental results: MiniCPM-2B outperforms Llama2-70B-Chat, Mistral-7B, etc. on MT-Bench. It runs ultra-fast on Apple Silicon.
…
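For context, a hedged sketch of running MiniCPM-2B with Hugging Face transformers; the repo id, `trust_remote_code`, and the chat template are assumptions taken from the public model card rather than from this post.

```python
# Hedged sketch: running MiniCPM-2B via transformers. The repo id is one of the
# published checkpoints; trust_remote_code is needed for its custom code, and
# the chat template is assumed to ship with the tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "openbmb/MiniCPM-2B-dpo-bf16"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto"
)

messages = [{"role": "user", "content": "Introduce yourself in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```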
-
### Question
I’m currently using PrivateGPT v0.6.1 with `Llama-CPP` support on a Windows machine with a `qdrant` DB. The LLM is `Mistral-7B-Instruct-v0.3` and the embedding model is `BAAI/bge-m3`.
I …
-
### What happened?
When calling [the `POST /v1/chat/completions` endpoint](https://platform.openai.com/docs/api-reference/chat/create), the user can add a `name` field to each message:
> `name` `s…
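For reference, a minimal sketch of a request that sets the per-message `name` field described above; the model name, base URL, and API key are placeholders.

```python
# Minimal sketch of a chat completions request that sets the optional
# per-message `name` field (per the OpenAI API reference linked above).
import requests

payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "name": "alice", "content": "Say hello to both of us."},
        {"role": "user", "name": "bob", "content": "And remember who said what."},
    ],
}
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```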
-
### System Info
- `transformers` version: 4.41.2
- Platform: Linux-5.15.0-1042-nvidia-x86_64-with-glibc2.35
- Python version: 3.9.18
- Huggingface_hub version: 0.23.3
- Safetensors version: 0.4…
-
Is it possible to add out-of-the-box functionality with open-source models like Llama 405B?
-
### What is the issue?
| NAME | ID | SIZE | MODIFIED |
| --- | --- | --- | --- |
| glm-4-9b-chat:latest | 5356a47a9286 | 6.3 GB | 3 minutes ago |
| llama3:latest | … | | |
-
### Describe the bug
I noticed yesterday that tokens per second drops heavily as the context length in the prompt gets larger. I tested ExUI in the same conda environment that Ooba creates.
…
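A rough sketch of how the slowdown could be quantified: time generation at doubling prompt lengths and report tokens per second. `generate_fn` is a hypothetical stand-in for whichever backend is being tested (Ooba, ExUI, ...), not a real API from either project.

```python
# Time generation at growing prompt lengths and print tokens/second.
import time

def tokens_per_second(generate_fn, prompt: str, n_new_tokens: int = 128) -> float:
    start = time.perf_counter()
    generate_fn(prompt, max_new_tokens=n_new_tokens)  # placeholder backend call
    elapsed = time.perf_counter() - start
    return n_new_tokens / elapsed

def sweep(generate_fn, base_chunk: str = "lorem ipsum " * 50):
    # Double the prompt size each step to see how throughput scales with context.
    for repeats in (1, 2, 4, 8, 16):
        prompt = base_chunk * repeats
        tps = tokens_per_second(generate_fn, prompt)
        print(f"~{len(prompt.split())} prompt words -> {tps:.1f} tok/s")
```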
-
Nomic's [GPT4All](https://gpt4all.io) runs large language models (LLMs) privately on everyday desktops & laptops. It has a Vulkan wrapper allowing all GPUs to work out of the box.
It unfortunatel…
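As an illustration, a minimal sketch using the `gpt4all` Python bindings; the model filename is an example from the GPT4All catalog, and `device="gpu"` asks the bindings to use the Vulkan backend where one is available.

```python
# Minimal sketch with the gpt4all Python bindings (`pip install gpt4all`).
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf", device="gpu")  # example model
with model.chat_session():
    print(model.generate("Name three uses of a fully local LLM.", max_tokens=128))
```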
-
Listing with `models-openai-compatible` is not working because of a `"GET not supported for requested URI."` error. But it is still possible to list Cloudflare models with the model search API https://developer…
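A hedged sketch of the workaround: listing Workers AI models through Cloudflare's model search API. The endpoint path and response shape are assumptions based on Cloudflare's documentation; the account ID and token are placeholders.

```python
# Sketch: list Workers AI models via the model search API (path assumed from
# Cloudflare's docs; verify against the documentation linked above).
import requests

ACCOUNT_ID = "your-account-id"
API_TOKEN = "your-api-token"

resp = requests.get(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/models/search",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
for model in resp.json().get("result", []):
    print(model.get("name"))
```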