QwenLM / Qwen

The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
Apache License 2.0

Edge-device deployment #1153

Closed: SoulProficiency closed this issue 6 months ago

SoulProficiency commented 6 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

1. I saw that RK (Rockchip) has published benchmark results for Qwen-1.8B. Does the Qwen team have any official material on this? I have already asked RK but received no reply.
2. Can deployment on Jetson be done with TensorRT-LLM?

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

Anything else?

No response

jklj077 commented 6 months ago
  1. To assess the performance of our models across various tasks, kindly refer to the model card, for example: https://huggingface.co/Qwen/Qwen-1_8B-Chat (see the loading sketch after this reply). Please note that we do not have information about specific edge-device performance. However, from the perspective of model effectiveness, especially for smaller model sizes, we suggest trying Qwen1.5.

  2. We currently lack definitive information on this matter. Nonetheless, it's a potential avenue to explore. For further guidance, please reach out to the TensorRT-LLM team directly.
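
A minimal sketch of loading Qwen-1.8B-Chat with Transformers, following the usage shown on the model card linked above. This is only an illustrative baseline, not an edge-deployment recipe: it assumes a device with a working PyTorch and transformers install (CPU or CUDA) and enough memory for the 1.8B model, and it does not cover TensorRT-LLM conversion.

```python
# Minimal sketch: load Qwen-1.8B-Chat via Transformers as a functional baseline
# before attempting any edge-specific optimization.
# Assumptions: PyTorch and transformers are installed; trust_remote_code=True is
# required because the Qwen model code ships with the Hugging Face repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-1_8B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # place weights on GPU if available, otherwise CPU
    trust_remote_code=True,
).eval()

# The Qwen remote code exposes a chat() helper that applies the chat template
# and tracks conversation history.
response, history = model.chat(tokenizer, "Hello, please introduce yourself.", history=None)
print(response)
```

For memory-constrained devices, the Qwen organization also publishes quantized variants (e.g. Qwen/Qwen-1_8B-Chat-Int4), which may be a better starting point than the full-precision checkpoint.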

SoulProficiency commented 6 months ago

Received, thank you.