-
### Your current environment
```text
docker run --rm --runtime nvidia --gpus all --name vllm-qwen72b -v ~/.cache/huggingface:/root/.cache/huggingface \
-v /data1/Download/models/Qwen-7…
```
-
First, I successfully installed the required packages and started the 4 servers.
Then I successfully registered the Qwen model with
`sllm-cli deploy --model Qwen2-1.5B-Instruct`
Finally, I sent a request:
`
…
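For illustration, the request step might look like the sketch below, assuming ServerlessLLM's OpenAI-compatible chat endpoint; the host and port 8343 come from the default quickstart setup and are assumptions that may differ from the actual deployment:

```python
# Minimal sketch of sending a request to the deployed model.
# Host/port are assumptions (ServerlessLLM's default quickstart endpoint);
# adjust to your deployment.
import requests

resp = requests.post(
    "http://127.0.0.1:8343/v1/chat/completions",
    json={
        "model": "Qwen2-1.5B-Instruct",  # same name used in `sllm-cli deploy`
        "messages": [{"role": "user", "content": "What is your name?"}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```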
-
`lm_eval --model local-chat-completions --tasks gpqa_main_cot_zeroshot --model_args model=Qwen/Qwen2-72B-Instruct,base_url=https://api.together.xyz/v1 --output_path ./gpqa/result/Qwen2 --use_cache ./g…
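The same evaluation can also be driven from Python via lm-evaluation-harness's `simple_evaluate`; a minimal sketch, assuming the same model and endpoint as the command above (the `apply_chat_template` flag is typically needed for chat-completions backends, but availability varies by harness version):

```python
# Python equivalent of the CLI invocation above (sketch).
# Set the provider's API key in the environment before running.
import lm_eval

results = lm_eval.simple_evaluate(
    model="local-chat-completions",
    model_args="model=Qwen/Qwen2-72B-Instruct,base_url=https://api.together.xyz/v1",
    tasks=["gpqa_main_cot_zeroshot"],
    apply_chat_template=True,  # usually required for chat-completions models
)
print(results["results"])
```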
-
Currently, I'm using fastchat==0.2.36 and vllm==0.4.3 to deploy a Qwen model as an inference service. Here are the commands for starting the service on my two servers.
server1:
`python3.9 -m fastchat.serve…
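Once the FastChat stack is up, clients can reach it through the OpenAI-compatible API server; a minimal sketch, assuming `fastchat.serve.openai_api_server` is running on server1 at its default port 8000 (host and registered model name below are placeholders):

```python
# Client sketch against FastChat's OpenAI-compatible server.
# "server1" and the model name are placeholders; FastChat's
# openai_api_server listens on port 8000 by default.
from openai import OpenAI

client = OpenAI(base_url="http://server1:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Qwen2-72B-Instruct",  # the name the model worker registered
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```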
-
### Your current environment
driver 1.17
vllm 0.5.3.post1+gaudi117
```text
export VLLM_GRAPH_RESERVED_MEM=0.1
export VLLM_GRAPH_PROMPT_RATIO=0.9
export VLLM_PROMPT_S…
```
-
I found that the config path the command looks for does not exist on my machine.
```
open /Users/Apple/.qwen-cli/api-config.json: no such file or directory
Error: open /Users/Apple/.qwen-cli/api-config.json: no such file or directory
```
but …
-
How can I use OpenLLM with a local LoRA model?
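I can't speak to a native OpenLLM flag for this, but one common workaround is to merge the adapter into the base weights with peft and then serve the merged checkpoint like any ordinary model; a minimal sketch, with all paths as placeholders:

```python
# Workaround sketch: merge a local LoRA adapter into its base model
# with peft, then point the serving stack at the merged checkpoint.
# All model names and paths below are placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
merged = PeftModel.from_pretrained(base, "/path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("/path/to/merged-model")
AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct").save_pretrained("/path/to/merged-model")
```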
-
I am getting this error:
ValueError: Attempted to load model 'llava_hf', but no model for this name found! Supported model names: llava, qwen_vl, fuyu, batch_gpt4, gpt4v, instructblip, minicpm_v, c…
-
If I want to test the Qwen model with the API, can I just use the GPTAPI class and replace the model URL with the Qwen one?
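In the spirit of the question, a sketch of that swap, assuming the GPTAPI in question is lagent's and that it exposes an OpenAI-style base-URL parameter (the `api_base` parameter name, endpoint URL, and model name are all assumptions; check the signature in your version):

```python
# Sketch: pointing GPTAPI at an OpenAI-compatible Qwen endpoint.
# `api_base`, the URL, and the model name are assumptions.
from lagent.llms import GPTAPI

llm = GPTAPI(
    model_type="qwen-plus",  # hypothetical model name
    key="YOUR_API_KEY",
    api_base="https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
)
print(llm.chat([{"role": "user", "content": "Hello"}]))
```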
-
DashScope is Qwen's native API. They also host many other models on their cloud, for example llama3-70b-instruct. Can we add support for the DashScope API?
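For context, DashScope also exposes an OpenAI-compatible endpoint, so a first pass at support could reuse existing OpenAI-client plumbing; a minimal sketch, where the model name `qwen-plus` is just an example and `DASHSCOPE_API_KEY` must be set in the environment:

```python
# Sketch: calling DashScope through its OpenAI-compatible endpoint.
# "qwen-plus" is an example model name; DASHSCOPE_API_KEY must be set.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
resp = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```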