pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Does `gpt-fast` work on V100 GPUs? #72

Open RomanKoshkin opened 6 months ago

RomanKoshkin commented 6 months ago

Everything works on my A6000s and A100s, but not on the older V100 (it fails, complaining that the compute capability is too low). Are there plans to add support for these legacy devices? Thanks!

Chillee commented 4 months ago

We haven't tested on V100s, so I'm not sure. I thought it worked, but I haven't checked.

Chillee commented 4 months ago

I actually tried it just now. The issue is that the V100 has poor bfloat16 support. If you change all the bfloat16 instances to float16, it should work.
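One way to apply this suggestion without hard-coding either dtype is to pick it based on the GPU's compute capability. This is a minimal sketch, not part of gpt-fast itself; `pick_dtype` is a hypothetical helper, and the assumption is that bfloat16 is only well supported from Ampere (sm_80) onward, while the V100 is sm_70:

```python
import torch

def pick_dtype() -> torch.dtype:
    """Hypothetical helper: choose a dtype the current device handles well."""
    if torch.cuda.is_available():
        major, _ = torch.cuda.get_device_capability()
        # Ampere (sm_80) and newer have native bfloat16 support;
        # older cards like the V100 (sm_70) should fall back to float16.
        return torch.bfloat16 if major >= 8 else torch.float16
    return torch.float32  # CPU fallback

print(pick_dtype())
```

With something like this, the places in the code that currently say `dtype=torch.bfloat16` could use `dtype=pick_dtype()` instead, so the same script runs on both A100s and V100s.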