-
Hi,
After upgrading to 1.78 today, I can't load Mixtral-based 8x7B models anymore.
Other models, such as 30B/70B Llama-type models, work.
I get the same error whether I use Vulkan or CLBlast, a…
-
### Python -VV
```shell
N/A
```
### Pip Freeze
```shell
N/A
```
### Reproduction Steps
N/A
### Expected Behavior
N/A
### Additional Context
We have a Mixtral implementation on JAX which wor…
-
Mixtral-8x7B-Instruct-v0.1 doesn't work: when I load the model in chat mode, it starts loading but doesn't finish and breaks.
Maybe Hugging Face changed the model or something else.
https:/…
-
Hi,
I see you have provided an example for Mistral models, which I could build successfully. However, when I try to benchmark such models using GPTSessionBenchmark, I get errors like:
`[TensorRT-LLM][ERR…
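For reference, this is roughly the shape of the invocation I mean (a hedged sketch: the flag names are recalled from the TensorRT-LLM benchmarks/cpp README and may differ between versions, and the engine directory is a placeholder):
```shell
# Sketch only: flags recalled from the benchmarks/cpp README, engine_dir is a placeholder.
# Mistral engines are built via the llama example, hence --model llama.
./benchmarks/gptSessionBenchmark \
    --model llama \
    --engine_dir "./engines/mistral-7b/" \
    --batch_size "1" \
    --input_output_len "60,20"
```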
-
Mistral's e2e demo perf (with tracing, embedding/argmax on host, untilizing on device) is 15.2 t/s/u.
Device perf is 22.3 t/s/u. e2e:device perf ratio = 68%
Dispatch times for 1 decoder layer are…
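For clarity, the 68% figure is simply the ratio of the two throughput numbers above:
```python
# ratio of end-to-end throughput to device-only throughput (t/s/u values from above)
e2e_tsu, device_tsu = 15.2, 22.3
print(f"e2e:device = {e2e_tsu / device_tsu:.0%}")  # -> e2e:device = 68%
```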
-
## 🚀 Feature
Mixtral 8x7B is a mixture-of-experts LLM that splits its parameters into 8 distinct expert groups, and I would like to do both training and inference with Thunder.
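As a rough illustration of the structure (a minimal sketch only, not Thunder's API or the actual Mixtral implementation; layer sizes are made up): each token is routed to the top-2 of 8 expert MLPs, and their outputs are mixed with the router's softmax weights.
```python
# Minimal mixture-of-experts sketch (illustration only; dimensions are arbitrary).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, dim=64, hidden=128, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)   # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):                          # x: (tokens, dim)
        logits = self.router(x)                    # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # normalize over the selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):                # accumulate weighted expert outputs
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e              # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(4, 64)).shape)               # torch.Size([4, 64])
```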
### Work items
- [x] Run `t…
-
```python
from edsl import Model
import time
models_list = [['Austism/chronos-hermes-13b-v2', 'deep_infra', 0], ['BAAI/bge-base-en-v1.5', 'together', 1], ['BAAI/bge-large-en-v1.5', 'together', …
-
"mixtral-8x22b" is gone ;-(
you can see it in this dynamic list of the models, ordered by number of providers:
```
import g4f
all=[]
for model in g4f.models._all_models:
    m=g4f.models.ModelUtils.c…
-
Hello,
I changed the batch size from 1 (default) to 8 and 32 and saw no change in paperQA's behaviour (answer quality and speed), with the following settings:
```
settings=Settings(
llm=f"openai/mixtral:8x7b",…
-
Putting this here; the latency change seems very substantial:
https://github.com/vllm-project/vllm/pull/2090