pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Set cuda device before init_process_group #66

Closed: yifuwang closed this 6 months ago

yifuwang commented 6 months ago

Stack from ghstack (oldest at bottom):