gemma-2b-it-cpu-int4.bin == 1.25 GB (1,346,559,040 bytes)
Is that the smallest size? Can it be quantized further?
4-bit quantized models generally offer the best balance between size and performance and are the most widely used; the int4 model should run on devices with at least a Snapdragon 600-series chipset. A 2-bit quantized Gemma model (not to be confused with 2B, which refers to 2 billion parameters) is not available on Kaggle yet. However, if you can get the raw weights, you can quantize the model yourself; packages like Ludwig support QLoRA for instruction fine-tuning.
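For reference, here is a minimal sketch of loading Gemma 2B in 4-bit with Hugging Face Transformers + bitsandbytes, the usual starting point for QLoRA-style fine-tuning. This is not the MediaPipe `.bin` pipeline; the `google/gemma-2b-it` checkpoint id and the NF4 settings are my choices for illustration.

```python
# Sketch only: 4-bit (NF4) loading of Gemma 2B via bitsandbytes,
# as typically done before QLoRA instruction fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize linear layers to 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for de-quantized compute
)

# Assumed checkpoint id; swap in whichever Gemma weights you have access to.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b-it",
    quantization_config=quant_config,
    device_map="auto",
)

prompt = "Explain quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0]))
```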
Thank you for bringing AI to Flutter in such a simple way. I am really looking forward to this lib and the opportunities it opens.
Could you please tell me how much the Gemma 2B model weighs? I presume this lib itself is quite light, so I am mostly concerned about the model's overall size, which, even if downloaded by the app rather than embedded, could limit its use on certain devices.
Are there also any recommendations regarding device specs to run this efficiently?