-
# Bug Report
```
{
  "model": "llava-llama3:8b",
  "prompt": "tell me a story!",
  "stream": false
}
```
This is the POST body …
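The report does not name the server that receives this body. Since the model tag format suggests Ollama, here is a minimal sketch of sending it from Python; the endpoint, default port, and the `response` field of the reply are assumptions based on Ollama's documented `/api/generate` API, not taken from the report.
```
# Minimal sketch (assumed Ollama server): send the POST body above and print the reply.
import requests

payload = {
    "model": "llava-llama3:8b",
    "prompt": "tell me a story!",
    "stream": False,
}

resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=120)
resp.raise_for_status()
# With "stream": false, Ollama returns a single JSON object with the generated text.
print(resp.json()["response"])
```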
-
### System Info
NVIDIA L20
CUDA 12.3
TensorRT-LLM 0.9.0.dev2024032600
### Who can help?
@byshiue
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
-…
-
```
from PIL import Image
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
import torch

MODEL_NAME = "openbmb/MiniCPM-V-2_6"
image = Image.open("dubu.png").con…
```
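The snippet is cut off above. A minimal sketch of how such a script typically continues with vLLM's multimodal `generate` API is shown below; the chat template call, the image placeholder tag, and the sampling values are assumptions based on the model's published usage examples, not taken from this report.
```
# Hedged continuation sketch (not from the report): load the model with vLLM,
# build a chat prompt, and pass the PIL image via multi_modal_data.
llm = LLM(model=MODEL_NAME, trust_remote_code=True, max_model_len=4096)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)

# MiniCPM-V expects an image placeholder in the prompt; the exact tag below is
# an assumption based on the model card examples.
messages = [{"role": "user", "content": "(<image>./</image>)\nDescribe this image."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    sampling_params=SamplingParams(temperature=0.7, max_tokens=256),
)
print(outputs[0].outputs[0].text)
```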
-
### What happened?
We expected llama.cpp to perform similarly to ipex-llm, but llama.cpp was almost two times slower than ipex-llm even though all parameters were the same.
…
-
**env:**
2080Ti * 2
cuda_12.3.r12.3/compiler.33567101_0
python3.9
pip install "sglang[all]"
**error:**
new fill batch. #seq: 1. #cached_token: 0. #new_token: 8. #remaining_req: 0. #running_req…
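The excerpt only shows the install step and the server-side log. For context, a hypothetical client request that would produce a fill batch like the one logged above could look like the sketch below; the endpoint shape follows sglang's HTTP `/generate` interface on its default port, and the prompt and sampling parameters are made up.
```
# Hypothetical reproduction sketch (assumed, not from the report): query a
# running sglang server over its /generate HTTP endpoint.
import requests

resp = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "Hello, my name is",
        "sampling_params": {"max_new_tokens": 16, "temperature": 0},
    },
)
print(resp.json())
```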
-
**Motivation**
- Currently we are using `source_url` as the single binary URL to download models. This:
  - Makes it harder for users to copy their model over and use it, as the model name is confusin…
-
I am using an M1, on commit 928e0b70.
When I run
```
./llava-cli -m ./models/llava-v1.6-mistral-7b/ggml-mistral-7b-q_5_k.gguf --mmproj ./models/llava-v1.6-mistral-7b/mmproj-mistral7b-f16-q6_k.gguf …
```
-
When I run
```
python -m qllm --model=/root/models/baichuan-inc/Baichuan2-7B-Base --method=gptq --nsamples=64 --wbits=4 --groupsize=128 --save /root/models/baichuan-inc/Baichuan2-7B-Base_gptq_4b --ex…
```
-
## Issue
Currently, all images, videos, and audio in the crawled page are returned in `result.media`, as follows:
```
{
  audio: [],
  images: [
    {
      src: "https://cdn.com/path/to/image",
      alt: "mobile_i…
```
-
Hi @BoyaWu10, thanks for your great project.
I find it difficult to get much useful information about SVIT from just the few images in the paper.
Have you tested SVIT on some traditional tasks, e.g.…