gemma-2b-it-cpu-int4.bin == 1.25 GB (1,346,559,040 bytes)
Is that the smallest size? Can it be quantized further?
4-bit quantized models generally offer the best balance between size and performance and are the most widely used; the int4 model should run on devices with at least a Snapdragon 600-series chipset. A 2-bit quantized Gemma model (not to be confused with 2B, which refers to 2 billion parameters) is not available on Kaggle yet. However, if you can get the raw weights, you can quantize the model yourself; packages like Ludwig support QLoRA for instruction fine-tuning.
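For reference, here is a minimal sketch of loading Gemma 2B in 4-bit with Hugging Face Transformers + bitsandbytes, the usual starting point for QLoRA-style fine-tuning. This is not the MediaPipe `.bin` pipeline; the `google/gemma-2b-it` checkpoint id and the NF4 settings are my choices for illustration.

```python
# Sketch only: 4-bit (NF4) loading of Gemma 2B via bitsandbytes,
# as typically done before QLoRA instruction fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize linear layers to 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for de-quantized compute
)

# Assumed checkpoint id; swap in whichever Gemma weights you have access to.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b-it",
    quantization_config=quant_config,
    device_map="auto",
)

prompt = "Explain quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0]))
```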
Thank you for bringing AI to Flutter in such a simple way. I am really looking forward to this lib and the opportunities it opens.
Could you please tell me how much the Gemma 2B model weighs? I presume this lib itself is quite light, so I am mostly concerned about the model's overall size, which, even if downloaded by the app rather than embedded, could limit its use on certain devices.
Are there also any recommendations regarding device specs to run this efficiently?