-
### Jan version
0.5.3
### Describe the Bug
I imported many models, and some of them fail to load when I select both of my graphics cards (two RTX 3060 12 GB).
If I unselect one of them, the…
-
I have tried the meta-llama/Llama-3-8b-chat-hf chat model and the togethercomputer/m2-bert-80M-8k-retrieval embedding model with embeddingDimension: 768, but I repeatedly get the following error when …
-
### Description
Hi,
When I send a request to AWS Bedrock, specifically to the Llama model below:
```
static const String metaLlamaID = 'us.meta.llama3-2-1b-instruct-v1:0';
static const String …
```
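For context, here is a minimal Python sketch of such a request via boto3's `bedrock-runtime` client, reusing the model ID from the snippet. The JSON body follows the Meta Llama request schema on Bedrock (`prompt`, `max_gen_len`, `temperature`, `top_p`); the helper name and default values are illustrative, not from the original code.

```python
import json

# Meta Llama models on Bedrock take a JSON body with "prompt",
# "max_gen_len", "temperature", and "top_p" fields.
def build_llama_body(prompt: str, max_gen_len: int = 512,
                     temperature: float = 0.5, top_p: float = 0.9) -> str:
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": temperature,
        "top_p": top_p,
    })

body = build_llama_body("Explain Amazon Bedrock in one sentence.")

# The actual call needs valid AWS credentials, so it is sketched here only:
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# resp = client.invoke_model(
#     modelId="us.meta.llama3-2-1b-instruct-v1:0",
#     body=body,
# )
# print(json.loads(resp["body"].read())["generation"])
```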
-
### Describe the feature
I tried vLLM and LMDeploy using the following command:
```
python run.py \
--datasets humaneval_gen \
--hf-type chat \
--hf-path meta-llama/Meta-Llama-3-…
```
-
I just installed llama-cpp-2 on Windows, and when I tried to build it, it failed with an error.
_Error_
```console
cargo build -p llama-cpp-2
Compiling windows_x86_64_msvc v0.52.6
Compiling windows…
```
-
### What happened?
As seen here:
https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md
The llama.cpp server should support `--prompt-cache [FNAME]`.
I have not been abl…
-
vLLM was recently updated to use PyTorch 2.5, so we can now benchmark torchao with torch.compile (previously blocked by the 2.5 update).
1. Install the most recent vLLM: `pip install https://vllm-wheels.s3.us…
-
Hello, I have been facing this issue with the Phi 3.5 Mini Instruct bnb 4-bit model. I am trying to do few-shot prompting for my dataset, and that is where I am getting this issue. In the case of SFT as well, sam…
-
llamafile is a local app (similar to llama.cpp) for distributing and running LLMs from a single file.
The library can be used with both `.gguf` and `.llamafile` files.
Repo: https://github.com/Mozilla…
-
Problem: when you use many different providers with many different models in a bot, it is very hard to find the one you need without sorting and filtering one big shared list of models.
Enhanceme…