Open · Junglesl opened this issue 1 year ago
Is there an existing issue for this?
Current Behavior
A single inference uses about 13 GB of GPU memory. Does that mean an 80 GB GPU can handle roughly 8 concurrent requests?
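For a rough sanity check: naive division gives 80 / 13 ≈ 6 concurrent requests, not 8. In practice, though, a serving process typically loads the model weights once and shares them across requests, so each additional request costs only its own activations and KV cache, and the real capacity depends on how the 13 GB splits between weights and per-request state. A minimal back-of-envelope sketch, with all numbers as illustrative assumptions:

```python
# Back-of-envelope estimate of concurrent request capacity.
# Assumption: the ~13 GB single-request footprint is mostly model
# weights, which are resident once and shared; each extra request
# only adds activation/KV-cache memory. All figures are made up
# for illustration — measure on your own hardware.

WEIGHTS_GB = 12.0      # assumed: shared model weights (loaded once)
PER_REQUEST_GB = 1.0   # assumed: activations + KV cache per request
TOTAL_GB = 80.0        # e.g. one 80 GB GPU

def max_concurrent_requests(total_gb, weights_gb, per_request_gb):
    """Requests that fit after the shared weights are resident."""
    return int((total_gb - weights_gb) // per_request_gb)

# Naive estimate: treat every request as a full independent copy.
print(int(TOTAL_GB // 13.0))                                   # → 6

# Shared-weights estimate under the assumptions above.
print(max_concurrent_requests(TOTAL_GB, WEIGHTS_GB, PER_REQUEST_GB))  # → 68
```

The gap between the two numbers is why the split matters: to get a real figure, measure `torch.cuda.max_memory_allocated()` after loading the model and again after one full generation, and use the difference as the per-request cost.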
Expected Behavior
No response
Steps To Reproduce
Same as the title.
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`):
Anything else?
No response