-
### Your current environment
vllm 0.5.0.post
### 🐛 Describe the bug
vllm 0.5.0.post
transformers
-
qwen# python convert_checkpoint.py --model_dir /code/tensorrt-llm/Qwen1.5-32B-Chat/ --output_dir ./trt_ckpt/qwen1.5-32b/fp16 --dtype float16 --tp_size 4
[TensorRT-LLM] TensorRT-LLM version: 0.11.0.de…
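The `--tp_size 4` flag asks the converter to shard the checkpoint for 4-way tensor parallelism. As a rough illustration (not TensorRT-LLM's actual internal sharding code), column-wise sharding of a weight matrix across ranks looks like this:

```python
# Sketch only: how tensor parallelism conceptually splits a weight matrix
# column-wise across tp_size ranks. Names and shapes here are illustrative;
# TensorRT-LLM performs the real sharding internally during conversion.
def shard_columns(matrix, tp_size, rank):
    """Return the column slice of `matrix` owned by `rank` (0-based)."""
    cols = len(matrix[0])
    assert cols % tp_size == 0, "hidden size must divide evenly by tp_size"
    width = cols // tp_size
    return [row[rank * width:(rank + 1) * width] for row in matrix]

# A toy 2x4 weight split across 4 ranks: each rank gets one column.
w = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
print(shard_columns(w, 4, 0))  # → [[1], [5]]
```

This is why the hidden dimensions of the model must be divisible by `tp_size`; mismatches are a common cause of conversion errors.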
-
### System Info
- Ubuntu 20.04
- NVIDIA H800
- CUDA version 11.8
### Who can help?
@kaiyux @byshiue
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
…
-
### Model description
Here is the model description
> gte-Qwen1.5-7B-instruct is the latest addition to the gte embedding family. This model has been engineered starting from the [Qwen1.5-7B](https:…
-
### What model would you like?
The code for Qwen1.5-MoE is in the latest Hugging Face transformers, and we advise you to build it from source; otherwise you might encounter the following erro…
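Before building from source, it can help to check whether the installed transformers is already new enough. The minimum version below is an assumption (Qwen1.5-MoE support reportedly landed around transformers 4.40.0; verify against the release notes), and the parser is a minimal sketch:

```python
# Sketch: compare an installed transformers version string against a minimum.
# ASSUMPTION: 4.40.0 as the first release with Qwen1.5-MoE support — verify
# against the transformers release notes before relying on it.
REQUIRED = (4, 40, 0)

def parse_version(v):
    """Parse 'x.y.z' (tolerating suffixes like '.dev0') into an int tuple."""
    parts = []
    for p in v.split(".")[:3]:
        digits = "".join(ch for ch in p if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

print(parse_version("4.40.0.dev0") >= REQUIRED)  # → True
print(parse_version("4.39.3") >= REQUIRED)       # → False
```

In practice you would feed `transformers.__version__` into `parse_version`; a `False` result suggests installing from source as advised above.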
-
Install with pip install ./dist/sophon-3.7.0-py3-none-any.whl --force-reinstall
Then run python python/qwen1_5.py --bmodel models/BM1684X/qwen1.5-1.8b_int4_1dev.bmodel --token python/token_config --dev_id 0 , and the following error occurs:…
-
### Feature request
https://github.com/QwenLM/Qwen1.5
https://huggingface.co/collections/Qwen/qwen15-65c0a2f577b1ecb76d786524
### Motivation
_No response_
### Other
_No response_
-
root@a:~/qwen/qwen.cpp/qwen_cpp# python3 convert.py -i /root/qwen/Qwen1.5-1.8B -t q4_0 -o qwen1_8b.bin
Special tokens have been added in the vocabulary, make sure the associated word embeddings are f…
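This warning means the tokenizer's vocabulary gained special tokens that the checkpoint's embedding matrix may not cover. A minimal sketch of the size check behind the warning (all sizes here are assumed for illustration, not read from the actual model):

```python
# Sketch: when special tokens are appended to the vocabulary, the embedding
# matrix must have at least that many rows before conversion. The concrete
# numbers below are ASSUMED for illustration only.
vocab_size = 151646      # tokenizer vocab after adding special tokens (assumed)
embedding_rows = 151643  # rows in the checkpoint's embedding matrix (assumed)

def missing_rows(vocab, rows):
    """Return how many embedding rows are missing for the vocab, 0 if none."""
    return max(0, vocab - rows)

print(missing_rows(vocab_size, embedding_rows))  # → 3
```

If the result is nonzero, the new rows exist only after resizing the embeddings (and ideally fine-tuning them), which is what the warning is asking you to confirm.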
-
python build.py --hf_model_dir /app/model/Qwen1.5-14B-Chat \
--dtype float16 \
--remove_input_padding \
--use_gemm_plugin float16 \
…