-
OpenAI:
- gpt-4o-2024-08-06
Google:
- gemini-1.5-pro-exp-0827
Meta:
- Meta-Llama-3.1-405B-Instruct
Anthropic:
- Claude 3.5 Sonnet
Mistral:
- Mistral-Large-2407
DeepSeek:
- DeepSee…
-
I want to merge Mistral Large with https://huggingface.co/softwareweaver/Twilight-Miqu-146B by adding some layers from Twilight Miqu to Mistral Large using the passthrough method. Is there a better wa…
AshD updated
2 months ago
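A passthrough merge of this kind is usually expressed as a mergekit slice config. The sketch below only shows the shape of such a config, assuming the layer tensors are shape-compatible between the two models; the layer ranges are hypothetical placeholders, not a recommendation:

```yaml
# mergekit passthrough sketch -- all layer ranges are illustrative placeholders
slices:
  - sources:
      - model: mistralai/Mistral-Large-Instruct-2407
        layer_range: [0, 60]
  - sources:
      - model: softwareweaver/Twilight-Miqu-146B
        layer_range: [40, 60]   # layers spliced in from Twilight Miqu (assumed range)
  - sources:
      - model: mistralai/Mistral-Large-Instruct-2407
        layer_range: [60, 88]
merge_method: passthrough
dtype: bfloat16
```

Note that passthrough simply stacks the listed slices, so it only makes sense if the hidden dimensions of the donor layers match the base model.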
-
graphrag concurrent_requests: 25 , timeout:180
after 180 s, only 5 chat completions finish; there are too many chat timeouts, like:
ERROR: Chat completion 2a725977bfa24ff5ad768d0f0cf563d7 cancelled by use…
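The symptom (a few completions finish, the rest cancelled at the timeout) is what a shared concurrency limit plus a per-request timeout produces. A minimal asyncio sketch of that pattern, not graphrag's actual code; the delays and the semaphore size are stand-ins for `concurrent_requests` and `timeout`:

```python
import asyncio

async def fake_chat_completion(delay: float) -> str:
    # Stand-in for an LLM chat call; a real client would hit the API here.
    await asyncio.sleep(delay)
    return "done"

async def guarded(sem: asyncio.Semaphore, delay: float, timeout: float) -> str:
    # Hold a semaphore slot for the whole request, so at most
    # `concurrent_requests` completions are in flight at once;
    # wait_for cancels the request once the timeout elapses.
    async with sem:
        return await asyncio.wait_for(fake_chat_completion(delay), timeout)

async def main():
    sem = asyncio.Semaphore(2)          # stand-in for concurrent_requests: 25
    fast = await guarded(sem, 0.01, 1.0)
    try:
        await guarded(sem, 1.0, 0.05)   # exceeds its timeout -> cancelled
        slow_timed_out = False
    except asyncio.TimeoutError:
        slow_timed_out = True
    return fast, slow_timed_out

result = asyncio.run(main())
print(result)
```

With a queue much deeper than the semaphore, most requests spend the timeout window waiting for a slot and are then cancelled, which matches the log above.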
-
### Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.…
-
### Describe the bug
There are some console errors. See below:
https://www.youtube.com/watch?v=ep4v2MtI3_s
### Link to the Bolt URL that caused the error
localhost only
### Steps to reproduce
Se…
-
### Describe the bug
ExLlamaV2 crashes when it starts loading onto the third GPU. No matter if the order is 3090, 3090, A4000 or A4000, 3090, 3090, when I try to load the Mistral Large 2407 exl2 3.0bpw it …
-
### Which version of assistant are you using?
1.1.0 as configured by Nextcloud AIO
### Which version of Nextcloud are you using?
29.0.7
### Which browser are you using? In case you are using the p…
-
Hi,
I'm using the docker version of Anything LLM and Chroma
```
mintplexlabs/anythingllm:master
chromadb/chroma:0.5.1.dev173
```
The docker host is a small VM with 2 CPUs, 4 GB of RAM, and no GPU.
…
-
### Describe your problem
Hi,
I have just bought a new computer with 4 GPUs and the VRAM is large enough to run some very large LLMs locally, like Mistral Large. I'm running the backend server with LM St…
-
I'm testing out flex attention to utilize some custom attention masks. The attention masks I'm working with are causal, except there is usually a relatively small single rectangular area of 0s i…
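A mask like that (causal with one rectangular hole) is naturally written as a flex-attention `mask_mod`. The rectangle bounds below are hypothetical placeholders; the predicate is written with `&`/`~` so it works both on plain ints and on the index tensors that `create_block_mask` passes in:

```python
# Hypothetical rectangle of disallowed positions (placeholders, not from the post)
Q_LO, Q_HI = 8, 16   # query rows covered by the zero rectangle
K_LO, K_HI = 2, 6    # key columns covered by the zero rectangle

def causal_with_hole(b, h, q_idx, kv_idx):
    """Return truthy where attention is allowed: the usual causal
    triangle, minus one rectangular block of zeros."""
    causal = q_idx >= kv_idx
    in_hole = (q_idx >= Q_LO) & (q_idx < Q_HI) & (kv_idx >= K_LO) & (kv_idx < K_HI)
    return causal & ~in_hole

# With PyTorch >= 2.5 this would plug in roughly as:
#   from torch.nn.attention.flex_attention import flex_attention, create_block_mask
#   block_mask = create_block_mask(causal_with_hole, B=None, H=None,
#                                  Q_LEN=seq_len, KV_LEN=seq_len)
#   out = flex_attention(q, k, v, block_mask=block_mask)
```

Because the hole is a small axis-aligned rectangle, most blocks stay fully causal and the block mask keeps its sparsity benefits.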