mlc-ai / llm-perf-bench

Apache License 2.0
Can an 8 GB RTX 3060 run the 13B model? #13

Closed: hiqsociety closed this issue 11 months ago

hiqsociety commented 11 months ago

Can an 8 GB RTX 3060 run the 13B model?

junrushao commented 11 months ago

I don't think so. With int4 quantization, the weights alone come to roughly 8 GB, and once the KV cache is taken into account, the total could reach 10-11 GB.
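
For anyone who wants to redo this arithmetic, here is a rough back-of-the-envelope sketch. It is not taken from llm-perf-bench, and everything numeric in it is an assumption: a Llama-2-13B-like shape (40 layers, hidden size 5120), an fp16 KV cache, a 4096-token context, and about 4.5 effective bits per weight to allow for quantization scales. The helper names `weight_bytes` and `kv_cache_bytes` are hypothetical. It also ignores activations and runtime overhead, so real usage will likely be somewhat higher.

```python
GB = 1e9  # decimal gigabytes, to match how GPU memory is usually quoted


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights: parameter count times bits per weight, in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, hidden_size: int, context_len: int,
                   bytes_per_elem: int = 2) -> int:
    """Memory for the KV cache: one K and one V vector per layer per token."""
    return 2 * n_layers * hidden_size * context_len * bytes_per_elem


# Assumed values (not stated in this thread):
N_PARAMS = 13e9          # "13b" model
BITS_PER_WEIGHT = 4.5    # int4 plus an allowance for per-group scales
N_LAYERS = 40            # Llama-2-13B-like shape
HIDDEN_SIZE = 5120       # Llama-2-13B-like shape
CONTEXT_LEN = 4096       # full context window
GPU_MEMORY = 8 * GB      # 8 GB RTX 3060

weights = weight_bytes(N_PARAMS, BITS_PER_WEIGHT)
kv_cache = kv_cache_bytes(N_LAYERS, HIDDEN_SIZE, CONTEXT_LEN)
total = weights + kv_cache

print(f"weights : {weights / GB:.1f} GB")
print(f"KV cache: {kv_cache / GB:.1f} GB")
print(f"total   : {total / GB:.1f} GB")
print("fits in 8 GB" if total <= GPU_MEMORY else "does not fit in 8 GB")
```

Under these assumptions the total lands at roughly 10.7 GB, in line with the 10-11 GB estimate above. A shorter context shrinks the KV cache term, but the weights alone already take up most of an 8 GB card.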