mittaltarkik closed this issue 2 months ago
Hello. You ran the command sb deploy --host-list localhost -i superbench/superbench:v0.10.0-cuda12.4
which points to a Docker image that has not been released yet. The most recent available image is superbench/superbench:v0.10.0-cuda12.2, according to the https://hub.docker.com/r/superbench/superbench/tags page.
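For reference, here is a minimal sketch of the same deploy command pointed at the cuda12.2 tag mentioned above. It reuses only the flags from the original command. The variable names IMAGE and DEPLOY_CMD are just for illustration, and whether the cuda12.2 image works with your host driver is an assumption you should verify on your VM.

```shell
#!/bin/sh
# Tag reported as the most recent one on Docker Hub in this thread;
# check the tags page again, since newer images may exist by now.
IMAGE="superbench/superbench:v0.10.0-cuda12.2"

# Same flags as the original command, only the image tag differs.
DEPLOY_CMD="sb deploy --host-list localhost -i $IMAGE"

if command -v sb >/dev/null 2>&1; then
    # The SuperBench CLI is installed, so run the deployment.
    $DEPLOY_CMD
else
    # No CLI on this machine: just print the command that would be run.
    echo "sb not found; would run: $DEPLOY_CMD"
fi
```

If the deployment still fails with this tag, please attach the new log so we can look at the actual error.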
What's the issue, what's expected?: I am facing an issue while running SuperBench (SB) on an NVIDIA A100.
How to reproduce it?:
VM: Standard ND96asr v4 (96 vCPUs, 900 GiB memory)
OS: Linux (Ubuntu 22.04)
CUDA: cuda_12.4.0_550.54
SB version: 10
Docker file: cuda12.2
Attachment: error_logs.txt
Log message or snapshot?: Log file attached.
Additional information: Please help us with the correct configuration.