-
Is multi-GPU deployment supported? If so, could you explain how to set it up? A single 3090 doesn't have enough VRAM: once the conversation gets a bit long it runs out of memory, and BMInf is too slow.
-
```shell
(viscpm) zzz@zzz:~/yz/AllVscodes/VisCPM-main$ python demo_chat.py
use CUDA_MEMORY_CPMBEE_MAX=1g to limit cpmbee cuda memory cost
/home/zzz/anaconda3/envs/viscpm/lib/python3.10/site-packages/bminf/w…
```
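A possible direction (not an official answer): if the checkpoint can be loaded through Hugging Face transformers, accelerate's `device_map` can shard it across both GPUs instead of using BMInf. The repo id, memory caps, and whether VisCPM's bundled loader accepts these arguments are assumptions; a minimal sketch:

```python
# Hypothetical multi-GPU sketch, assuming the checkpoint loads via transformers
# with trust_remote_code; VisCPM's own loader may not accept these arguments.
import torch
from transformers import AutoModel, AutoTokenizer

model_path = "openbmb/VisCPM-Chat"  # assumption: illustrative checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",                    # let accelerate split layers across GPUs
    max_memory={0: "22GiB", 1: "22GiB"},  # per-GPU cap, e.g. two 24 GB 3090s
)
model.eval()
```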
-
When deploying on a Mac, I get an error saying CUDA is not available.
```shell
(OmniLMM) crz@crzdeMacBook-Air OmniLMM % pip install flash_attn
Collecting flash_attn
Downloading flash_attn-2.5.2.tar.gz (2.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━…
```
-
Reinstalling flash_attn_2 does not solve it either.
Still the same error.
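One hedged workaround sketch (not an official fix): flash_attn is CUDA-only, so on Apple silicon the usual route is to skip installing it and load the model with the default attention implementation on the MPS backend. The model id and whether this checkpoint honors the `attn_implementation` argument are assumptions:

```python
# Hypothetical sketch: avoid the flash_attn code path on a Mac (no CUDA) and
# run on the MPS backend instead. Model id and argument support are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/OmniLMM-12B"  # assumption: illustrative model id
device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    attn_implementation="eager",  # do not require flash-attention kernels
).to(device)
model.eval()
```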
-
### Your question
2024-07-03 01:59:42,163- root:179- ERROR- !!! Exception during processing!!! 'NoneType' object has no attribute 'size'
### Logs
```powershell
2024-07-03 01:59:42,163- root:179- …
```
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing answer for this in the FAQ?
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
How can I use llama3 in an agent? It seems it does not support function_calling.
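A possible direction, assuming the question is about LlamaIndex (the package names and imports below are assumptions): since llama3 exposes no native function-calling API, a ReAct-style agent can still drive tools purely through prompting:

```python
# Hypothetical sketch: a ReAct agent drives tools via prompting, so it works
# with llama3 even without native function calling. Assumes llama-index-core
# and llama-index-llms-ollama are installed and an Ollama server is running.
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the result."""
    return a * b


llm = Ollama(model="llama3", request_timeout=120.0)
agent = ReActAgent.from_tools(
    [FunctionTool.from_defaults(fn=multiply)],
    llm=llm,
    verbose=True,  # print the ReAct reasoning / tool-call trace
)
print(agent.chat("What is 12.3 times 4.56?"))
```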
-
As the title says, I'd like to run this on an edge device with limited compute (about 1.5 TOPS).
-
I noticed that the MiniCPM LLM has several versions, e.g. fp32, bf16, dpo, and sft.
Which LLM does the multimodal model use?
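Not an authoritative answer, but the saved-weight precision part of the question can be checked directly from the multimodal checkpoint's config; the repo id below is an assumption for illustration:

```python
# Hypothetical check: read the multimodal checkpoint's config to see which
# dtype the weights were saved in (bf16 vs fp32). Repo id is an assumption.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("openbmb/MiniCPM-V", trust_remote_code=True)
print(cfg.torch_dtype)  # e.g. torch.bfloat16 or torch.float32
```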
-
### System Info
```shell
Optimum main branch, commit bb21ae7f7d572805f6ecdea8e0f02dc6014d57e8
Transformers 4.38.1
OnnxRuntime 1.17.1
PyTorch 2.2.1
TensorRT 8.6.1 (nvcr.io/nvidia/tensorrt:23.10…
```