-
I have tried the meta-llama/Llama-3-8b-chat-hf chat model and the togethercomputer/m2-bert-80M-8k-retrieval embedding model with embeddingDimension: 768, but I'm hitting the following error many times when …
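A mismatch between the configured `embeddingDimension` and the model's actual output size is a common cause of indexing errors with these setups. A minimal sketch of an early sanity check (the dimension table below is an assumption based on Together's public model cards; the helper name is hypothetical):

```python
# Assumed output dimensions for a couple of Together embedding models;
# verify against the model cards before relying on these values.
EMBEDDING_DIMS = {
    "togethercomputer/m2-bert-80M-8k-retrieval": 768,
    "togethercomputer/m2-bert-80M-2k-retrieval": 768,
}

def check_embedding_config(model: str, embedding_dimension: int) -> None:
    """Fail fast if the index dimension does not match the model's output."""
    expected = EMBEDDING_DIMS.get(model)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {model}")
    if expected != embedding_dimension:
        raise ValueError(
            f"{model} produces {expected}-dim vectors, "
            f"but the index is configured for {embedding_dimension}"
        )

# Passes silently when the dimensions agree.
check_embedding_config("togethercomputer/m2-bert-80M-8k-retrieval", 768)
```

Running this check before creating the vector index turns a vague runtime failure into an immediate, descriptive error.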
-
I managed to make the Llama Stack server and client work with Ollama on both EC2 (with a 24 GB GPU) and Mac (tested on a 2021 M1 and a 2019 2.4 GHz i9 MBP, both with 32 GB of memory). Steps are below:
1. Open …
-
### 🚀 The feature, motivation and pitch
Currently, there is a [parallel_tool_calls](https://github.com/vllm-project/vllm/blob/18b296fdb2248e8a65bf005e7193ebd523b875b6/vllm/entrypoints/openai/protocol…
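As a sketch of the request-level behavior being asked for: the payload below follows the OpenAI chat-completions schema, where `parallel_tool_calls=False` asks the server to emit at most one tool call per assistant turn (the `get_weather` tool and model name are illustrative only; whether the server enforces the flag is the subject of this request):

```python
def build_chat_request(model: str, messages: list, tools: list,
                       parallel_tool_calls: bool = True) -> dict:
    """Build an OpenAI-style /v1/chat/completions request body.

    Setting parallel_tool_calls=False requests OpenAI semantics of at
    most one tool call per turn; support varies by serving backend.
    """
    return {
        "model": model,
        "messages": messages,
        "tools": tools,
        "parallel_tool_calls": parallel_tool_calls,
    }

body = build_chat_request(
    "meta-llama/Llama-3-8b-chat-hf",
    [{"role": "user", "content": "What is the weather in Paris and Berlin?"}],
    [{"type": "function",
      "function": {"name": "get_weather",
                   "parameters": {"type": "object",
                                  "properties": {"city": {"type": "string"}}}}}],
    parallel_tool_calls=False,
)
```

A backend that honors the flag would respond to this body with a single `get_weather` call, then expect a follow-up turn for the second city.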
-
### Jan version
0.5.4
### Describe the Bug
I can successfully load the model for chats, but as soon as I send an image, it crashes.
Context:
- I created a model.json to download the text an…
-
### Description
Run `python app.py`; then:
```
Traceback (most recent call last):
  File "/hy-tmp/kotaemon/app.py", line 13, in <module>
    from ktem.main import App  # noqa
  File "/hy-tmp/kotaemon/libs/…
```
-
Hello, I have been facing this issue with the Phi-3.5-mini-Instruct bnb 4-bit model. I am trying to do few-shot prompting for my dataset, and that is where I am getting this issue. In the case of SFT as well, sam…
-
### 🚀 The feature, motivation and pitch
Fine-tuning with only FSDP works well, and sharded checkpoints are saved as `__0_*.distcp`, `.metadata`, and `train_params.yaml`. I can see the loss drop reas…
-
# URL
- http://arxiv.org/abs/2411.04109
# Authors
- Archiki Prasad
- Weizhe Yuan
- Richard Yuanzhe Pang
- Jing Xu
- Maryam Fazel-Zarandi
- Mohit Bansal
- Sainbayar Sukhbaatar
- Jason W…
-
Here is the full log trace:
```
> Enter a name for your Llama Stack (e.g. my-local-stack): test
> Enter the image type you want your Llama Stack to be built as (docker or conda): docker
Llama S…
```
-
### What is the issue?
I have 4 GPU cards, each with 24 GB of memory.
![image](https://github.com/user-attachments/assets/f938bc27-4c40-4f2b-b855-0535485a7f3e)
It's ok for recognizing short text cont…