Open schopra8 opened 2 weeks ago
I've found the line that causes the hang -- but I have no clue why this is a problem:
I'm running with `dp=1` and `tp=1` (i.e., on a single GPU). If I run `torch.cuda.set_device(0)` in my Python script before I create the Runtime, everything works as expected.
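For anyone who wants to try the same workaround, here is a minimal sketch of the ordering that worked for me: select the CUDA device first, then build the runtime. The helper name `create_with_pinned_device` and the `factory` argument are hypothetical, introduced only for illustration. The `torch` import is guarded so the sketch also runs on a machine without PyTorch.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def create_with_pinned_device(factory: Callable[[], T], device_index: int = 0) -> T:
    """Select a CUDA device, then call `factory` to build the runtime.

    Hypothetical helper: it only enforces the call order described above
    (set_device first, runtime creation second).
    """
    try:
        import torch
    except ImportError:
        torch = None  # allows the sketch to run where PyTorch is not installed

    # Only touch CUDA when PyTorch is present and can see a GPU.
    if torch is not None and torch.cuda.is_available():
        torch.cuda.set_device(device_index)

    return factory()


# Usage sketch. The Runtime arguments are assumptions; adjust them to your setup:
#
#   import sglang as sgl
#   runtime = create_with_pinned_device(
#       lambda: sgl.Runtime(model_path="<your-model-path>")
#   )
```

This does not explain why the explicit `set_device` call is needed inside the container; it only reproduces the ordering that avoided the hang in my case.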
We're trying to run the latest version of `sg-lang` in a Docker container (PyTorch 2.3.0, CUDA 12.1), but the runtime instantiation gets stuck: it starts loading the model onto the GPU and then hangs.

We've been able to run `sg-lang` without any problems on the host operating system. So we pip froze the requirements on the host instance and installed those exact packages inside the Docker container, but we're still hitting this model loading hang.

Has anyone seen this issue before? Any ideas what might be going wrong?
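Since the Python packages are identical on both sides, one thing that may help narrow this down is comparing what PyTorch actually sees on the host versus inside the container. Below is a rough diagnostic sketch; the function name `collect_gpu_env` is hypothetical, and the set of fields is only a guess at what is worth comparing.

```python
import os
import platform


def collect_gpu_env() -> dict:
    """Gather a few GPU-related facts to diff between host and container."""
    info = {
        "python": platform.python_version(),
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_available": None,
        "device_count": None,
    }
    try:
        import torch
    except ImportError:
        return info  # PyTorch not installed; report only the basics

    info["torch_version"] = torch.__version__
    info["torch_cuda_build"] = torch.version.cuda  # CUDA version PyTorch was built with
    info["cuda_available"] = torch.cuda.is_available()
    info["device_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in collect_gpu_env().items():
        print(f"{key}: {value}")
```

Running this once on the host and once in the container and diffing the output should show whether the two environments differ in visible devices or CUDA build, which a `pip freeze` comparison would not capture.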