Open starlitsky2010 opened 1 year ago
It works well when I use the nvcr.io/nvidia/pytorch:22.09-py3 Docker container.
But another bug still exists: the world_size is always for Bloom model when doing inference.
Although it's been a while and support seems to have been dropped, I ran into a similar problem when running bert_example.py, and in my case it turned out to be a PyTorch 2.0.0 bug. Maybe your problem is a similar one? https://github.com/pytorch/pytorch/issues/97507
Branch/Tag/Commit
main
Docker Image Version
nvcr.io/nvidia/pytorch:23.04-py3
GPU name
3090
CUDA Driver
530.30.02
Reproduced Steps