mlc-ai / llm-perf-bench

Apache License 2.0

Can an 8 GB RTX 3060 run the 13B model? #13

Closed hiqsociety closed 1 year ago

hiqsociety commented 1 year ago

Can an 8 GB RTX 3060 run the 13B model?

junrushao commented 1 year ago

I don't think so. With int4 quantization, the weights themselves come to around 8 GB, and once the KV cache is taken into account, the upper bound could be 10-11 GB.
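
For anyone who wants to redo this arithmetic for a different card or model, here is a rough back-of-the-envelope sketch. It is not how llm-perf-bench or MLC measures memory. The shape values (40 layers, hidden size 5120, 2048-token context) assume a LLaMA-13B-style model, and the 4.5 bits per weight is an assumed allowance for group-quantization scales; both are illustrative, and the function names are made up for this example. It also ignores activations and runtime overhead, so real usage will be higher.

```python
def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, hidden_size, seq_len, bytes_per_elem=2, batch=1):
    """Approximate KV cache size in bytes.

    The leading 2 accounts for one K and one V tensor per layer;
    bytes_per_elem=2 assumes the cache is stored in fp16.
    """
    return 2 * n_layers * hidden_size * seq_len * bytes_per_elem * batch


GB = 1e9

# Assumed values (not from the thread): 13e9 parameters, ~4.5 bits per
# weight (int4 plus per-group scales), LLaMA-13B-style shape, 2048 tokens.
weights = weight_bytes(13e9, 4.5)
kv = kv_cache_bytes(n_layers=40, hidden_size=5120, seq_len=2048)

print(f"weights:  {weights / GB:.2f} GB")
print(f"KV cache: {kv / GB:.2f} GB")
print(f"total:    {(weights + kv) / GB:.2f} GB (before activations and overhead)")
```

Under these assumptions the weights plus a full-context KV cache already exceed 8 GB, which is in line with the answer above.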