google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0
27.42k stars 5.15k forks

Support for GEMMA-2-2B-it Model #5594

Closed jfduma closed 3 weeks ago

jfduma commented 2 months ago

MediaPipe Solution (you are using)

Version: 0.10.14

Programming language

No response

Are you willing to contribute it

None

Describe the feature and the current behaviour/state

It seems that MediaPipe currently does not support the GEMMA-2-2B-it model. Is there a plan to support it, and when can we expect it to be available?

Will this change the current API? How?

No response

Who will benefit with this feature?

No response

Please specify the use cases for this feature

convert gemma-2-2b-it for mediapipe

Any Other info

No response

kuaashish commented 2 months ago

Hi @jfduma,

Thank you for submitting this feature request. We believe it aligns with our current roadmap, and we are working on its implementation. However, we cannot commit to a specific availability timeline at this moment. We will share your request with our team and notify you once it's supported.

jfduma commented 2 months ago

Thank you for the update. I look forward to hearing more when it's available.

Best, jfduma

kuaashish commented 1 month ago

Hi @jfduma,

Our LLM Task API now supports the Gemma 2-2B models. You can find more information on our documentation page. Additionally, you can follow this example to convert the models to TFLite.
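For readers landing here later, the linked example follows roughly this shape. This is a sketch only: the paths are placeholders, and the `build_2b_model` / `convert_to_tflite` names are assumptions taken from the ai-edge-torch generative examples, so they may differ between releases.

```python
# Sketch: converting Gemma-2-2B-it to TFLite with AI Edge Torch (CPU backend).
# Paths are placeholders; function names are assumptions from the
# ai-edge-torch generative examples and may vary by release.
from ai_edge_torch.generative.examples.gemma import gemma2
from ai_edge_torch.generative.utilities import converter

# Load the PyTorch checkpoint downloaded from Kaggle or Hugging Face.
pytorch_model = gemma2.build_2b_model(
    "/path/to/gemma-2-2b-it", kv_cache_max_len=1024
)

# Export a quantized .tflite for the CPU backend.
converter.convert_to_tflite(
    pytorch_model,
    tflite_path="/tmp/gemma2_2b_it_cpu.tflite",
    prefill_seq_len=512,
    quantize=True,
)
```

The resulting .tflite still needs to be packaged into a task bundle (per the MediaPipe LLM Inference docs) before the LLM Inference API can load it.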

Thank you!!

tyrmullen commented 1 month ago

Note that the conversion process mentioned in the documentation above will only work for CPU backends, which I believe limits its usage to the Android and iOS APIs. Unfortunately this is not mentioned in the paragraph linked above, but the relevant disclaimer can be found on the LLM Inference overview page:

Note: Models mapped using AI Edge Torch can only run on the CPU backend and are therefore limited to Android and iOS.

So running Gemma-2-2B as described above uses a different conversion process, which will only work with the CPU backend on Android (and possibly iOS).

github-actions[bot] commented 1 month ago

This issue has been marked stale because it has seen no activity for 7 days. It will be closed if no further activity occurs. Thank you.

tyrmullen commented 1 month ago

Gemma-2-2B should now also be runnable on GPU on the web with the latest (0.10.16) release. We have not published a pre-converted model, nor released detailed documentation for it yet, but the usual conversion process (from .safetensors) should work there now. So to summarize the current state of support:

- Android and iOS: CPU backend only, via the conversion process linked above.
- Web: GPU supported as of release 0.10.16, via the usual conversion from .safetensors.
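For the web path, the "usual conversion process" is driven from Python. A minimal sketch follows, assuming the `mediapipe.tasks.python.genai` converter API; the paths are placeholders and the `model_type` string `"GEMMA2_2B"` is a guess that should be checked against the values accepted by your installed MediaPipe release.

```python
# Sketch: converting gemma-2-2b-it .safetensors for the web GPU backend.
# Paths are placeholders; the model_type string "GEMMA2_2B" is an
# assumption -- check the names accepted by your mediapipe release.
from mediapipe.tasks.python.genai import converter

config = converter.ConversionConfig(
    input_ckpt="/path/to/gemma-2-2b-it/",   # dir with .safetensors shards
    ckpt_format="safetensors",
    model_type="GEMMA2_2B",
    backend="gpu",                          # web runs the GPU backend
    output_dir="/tmp/gemma2_intermediate/",
    combine_file_only=False,
    vocab_model_file="/path/to/gemma-2-2b-it/tokenizer.model",
    output_tflite_file="/tmp/gemma2-2b-it-gpu.bin",
)
converter.convert_checkpoint(config)
```

The output file is what you would then point the web LLM Inference task at via its model asset path option.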

github-actions[bot] commented 1 month ago

This issue has been marked stale because it has seen no activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 3 weeks ago

This issue was closed due to lack of activity after being marked stale for the past 7 days.