Closed: Xiaoshu-Zhao closed this issue 3 months ago.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue since no updates have been observed. Feel free to re-open if you need any further assistance.
Check the following items before submitting
Issue type
Local inference
Base model
llama-3-chinese-8b-instruct-v2
Operating system
WSL2
Describe the problem in detail
I want to run llama3 locally under WSL2. I followed the Hugging Face inference method from the README, but inference hangs and GPU utilization stays very low. What could be the problem?

Dependencies (required for code-related issues)
My machine uses an RTX 3060; the CUDA setup is as follows:
PyTorch version: 2.3.0+cu121
CUDA available: True
CUDA version: 12.1
cuDNN version: 8902
GPU: NVIDIA GeForce RTX 3060
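One common cause of this symptom (inference "hangs" with very low GPU utilization) is the model not actually running on the GPU: an 8B model in fp16 needs roughly 16 GiB for its weights, while an RTX 3060 has 12 GiB, so layers may spill to CPU or the model may load in fp32 on CPU entirely. Below is a minimal diagnostic sketch, not the repo's own script; the model path is hypothetical, and the `pick_device_and_dtype` helper is an illustrative assumption, not part of the project:

```python
def pick_device_and_dtype(cuda_available: bool, vram_gib: float):
    """Illustrative helper: decide where an 8B-parameter fp16 model fits.

    fp16 weights take ~2 bytes per parameter, so an 8B model needs
    ~14.9 GiB of VRAM just for weights; below that, letting
    accelerate offload layers ("auto") avoids an outright OOM but
    will be slow, which matches the low-GPU-utilization symptom.
    """
    if not cuda_available:
        return "cpu", "float32"
    fp16_gib = 8e9 * 2 / 2**30  # ~14.9 GiB for 8B params in fp16
    if vram_gib >= fp16_gib:
        return "cuda", "float16"
    return "auto", "float16"  # partial CPU offload via accelerate


if __name__ == "__main__":
    try:
        import torch
        from transformers import AutoModelForCausalLM

        has_cuda = torch.cuda.is_available()
        vram = (
            torch.cuda.get_device_properties(0).total_memory / 2**30
            if has_cuda
            else 0.0
        )
        device_map, dtype = pick_device_and_dtype(has_cuda, vram)
        model = AutoModelForCausalLM.from_pretrained(
            "llama-3-chinese-8b-instruct-v2",  # hypothetical local path
            torch_dtype=getattr(torch, dtype),
            device_map=device_map,
        )
        # If this prints "cpu", the slowdown is explained.
        print(next(model.parameters()).device)
    except Exception:
        # Environment-dependent: skip when torch/transformers or the
        # model weights are unavailable.
        pass
```

On a 12 GiB card this helper picks `device_map="auto"`, i.e. some layers run on CPU; checking `nvidia-smi` memory usage during generation, or loading a 4-bit quantized variant instead, would confirm whether offloading is the bottleneck.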
Runtime logs or screenshots