random452 closed this 1 year ago
I haven't tested that. But if you want an estimate, a 3090 should be roughly 1.5~2x faster than an NVIDIA T4. Note that CPU memory also plays an important role. You can expect a slowdown with less CPU memory. You are welcome to post the statistics if you try it out. :)
Thanks, I will try. Is 64 GB enough for 30B, or should I get 128?
It is better not to be that tight. You will need additional space for the KV cache. There is an option called --pin-weight. It can make offloading faster when turned on, but it will use 2x the size of the model weights in CPU memory. So if your CPU memory can only accommodate 1x the model weights, turn off --pin-weight.
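To put rough numbers on this, here is a minimal back-of-the-envelope sketch. It assumes all weights are offloaded to CPU memory and stored uncompressed in fp16 (2 bytes per parameter), and it ignores the KV cache and any other overhead. The function name and parameters are made up for illustration and are not part of FlexGen.

```python
def estimate_weight_memory_gib(num_params, bytes_per_param=2.0, pin_weight=False):
    """Rough CPU memory needed for model weights, in GiB.

    Assumptions (not taken from FlexGen's code):
    - all weights live in CPU memory
    - fp16 storage, i.e. 2 bytes per parameter, unless overridden
    - pinning doubles the footprint, as described in the reply above
    - KV cache and other buffers are NOT included
    """
    weight_bytes = num_params * bytes_per_param
    if pin_weight:
        weight_bytes *= 2  # pinned weights take about 2x the model weights
    return weight_bytes / 2**30


if __name__ == "__main__":
    params_30b = 30e9
    print(f"30B fp16, unpinned: {estimate_weight_memory_gib(params_30b):.1f} GiB")
    print(f"30B fp16, pinned:   {estimate_weight_memory_gib(params_30b, pin_weight=True):.1f} GiB")
    # prints about 55.9 GiB and 111.8 GiB
```

Under these assumptions, uncompressed fp16 weights for a 30B model already take most of a 64 GB machine before the KV cache is counted, and pinned weights would not fit. Compression lowers bytes_per_param, so the real numbers depend on the settings you run with.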
Hello, I have a 3090. How fast can I run Erebus 30B if I use FlexGen with compression?