Open Godlovecui opened 5 months ago
Hi @Godlovecui, I saw you're using TensorRT-LLM v0.9.0. Is it possible to try the latest main branch and see if the issue still exists?
Have you tried `nvidia-smi topo -p2p r` to inspect whether the drivers for your GPUs are installed and support peer-to-peer access?
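For reference, the check looks like this from a shell (the OK/NS legend is assumed from the `nvidia-smi` documentation; anything other than OK between a GPU pair means P2P reads are unavailable there):

```shell
# Query peer-to-peer read capability between all GPU pairs.
# "OK" = P2P reads supported between that pair; "NS" = not supported.
nvidia-smi topo -p2p r
```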
Also, I have encountered similar issues where my default GPU installation required me to rebuild the engine with the `use_custom_all_reduce` flag disabled.
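A minimal sketch of such a rebuild, assuming the v0.9 `trtllm-build` interface and placeholder checkpoint/engine paths (verify the exact option name with `trtllm-build --help` on your version):

```shell
# Rebuild the engine with the custom all-reduce kernel disabled.
# The directory paths are hypothetical; substitute your own.
trtllm-build --checkpoint_dir ./llama3-8b-ckpt \
             --output_dir ./llama3-8b-engine \
             --use_custom_all_reduce disable
```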
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.
System Info
GPUs: 8× RTX 4090; TensorRT-LLM: v0.9.0; tensorrtllm_backend: v0.9.0
Who can help?
@kaiyux @BY
Information

Tasks

- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)

Reproduction

None
Expected behavior
None
actual behavior
None
additional notes
When I deploy Llama-3-8B in Triton server, it raises the error below, yet it also prints the server-launched-successfully flag. However, when I send requests to the server, ...
How can I fix it? Thank you~
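For anyone reproducing the request step, a request of the kind described, following the tensorrtllm_backend quickstart (assuming its default `ensemble` model name and HTTP port 8000; adjust to your deployment), looks like:

```shell
# Send a generation request to the Triton HTTP generate endpoint.
# "ensemble" and port 8000 are the tensorrtllm_backend quickstart defaults.
curl -X POST localhost:8000/v2/models/ensemble/generate \
     -d '{"text_input": "What is machine learning?", "max_tokens": 64, "bad_words": "", "stop_words": ""}'
```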