google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
Apache License 2.0

Repeated Junk output generated using gemma-2b-it-gpu-int4.bin on Mobile device #5534

Open KosuriSireesha opened 1 month ago

KosuriSireesha commented 1 month ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

No

OS Platform and Distribution

Android 14

Mobile device if the issue happens on mobile device

Android Mobile device (Motorola edge 50 Ultra)

Browser and version if the issue happens on browser

No response

Programming Language and version

Kotlin

MediaPipe version

0.10.14

Bazel version

No response

Solution

LLMInference

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

The Gemma GPU model gives junk output from the second query onward.

Describe the expected behaviour

The model should give a relevant response to the user's query.

Standalone code/steps you may have used to try to get what you need

Tested on an Android mobile device.
Followed the steps at https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/android
1. Used the LLMInference example app from https://github.com/googlesamples/mediapipe
2. Added the Maven dependency in the build.gradle:
 dependencies {
    implementation 'com.google.mediapipe:tasks-genai:0.10.14'
}
https://mvnrepository.com/artifact/com.google.mediapipe/tasks-genai
3. GPU model used in the LlmInference example: gemma-2b-it-gpu-int4.bin (downloaded from https://www.kaggle.com/models/google/gemma/tfLite)
4. Ran the LlmInference app on the mobile device.
5. Entered a query; the model gave a response.
6. Entered a second query, either related to the context of the first query or any other query.
A continuous junk response is shown, without "done" ever being sent.
The LLM Inference app needs to be restarted to use it again.

I get a response only for the first query; from the second query onward I get junk output.
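For reference, the query flow described above roughly follows this pattern (a minimal sketch against the tasks-genai 0.10.x API; the model path, tag, and prompts are placeholders, not the sample app's exact code):

```kotlin
import android.content.Context
import android.util.Log
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Sketch: one LlmInference engine created once, then reused for
// successive queries, as in the sample app.
fun runTwoQueries(context: Context) {
    val options = LlmInference.LlmInferenceOptions.builder()
        // Placeholder path; the sample app loads the .bin pushed on-device.
        .setModelPath("/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin")
        .setMaxTokens(1024)
        .setResultListener { partialResult, done ->
            // On 0.10.14, from the second query onward partialResult
            // contains the repeated junk tokens shown in the logs,
            // and done is never reported as true.
            Log.d("ChatViewModel", "partialresult: $partialResult done: $done")
        }
        .build()

    val llmInference = LlmInference.createFromOptions(context, options)

    // First query: responds correctly.
    llmInference.generateResponseAsync("What is the capital of France?")
    // Second query (issued after the first completes): junk output.
    llmInference.generateResponseAsync("And of Germany?")
}
```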

Other info / Complete Logs

Output of the partial results from LlmInference:

36489: 07-15 16:42:41.352 24038 24072 D ChatViewModel: partialresult first message: катеринакатеринакатерина
    Line 36514: 07-15 16:42:41.455 24038 24072 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36650: 07-15 16:42:41.565 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36680: 07-15 16:42:41.666 24038 24073 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36692: 07-15 16:42:41.771 24038 24308 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36693: 07-15 16:42:41.873 24038 24073 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36699: 07-15 16:42:41.977 24038 24072 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36709: 07-15 16:42:42.081 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36710: 07-15 16:42:42.186 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36711: 07-15 16:42:42.286 24038 24073 D ChatViewModel: partialresult append message: катеринакатеринакатерина
    Line 36714: 07-15 16:42:42.387 24038 24071 D ChatViewModel: partialresult append message: катеринакатеринакатерина
kuaashish commented 1 month ago

Hi @KosuriSireesha,

Could you please try running the sample app from https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/llm_inference/android and let us know if you encounter the same issue there as well?

Thank you!!

KosuriSireesha commented 1 month ago

Hi @kuaashish, yes, I tested the sample app from https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/llm_inference/android

The issue is still reproducible.

Note: on com.google.mediapipe:tasks-genai:0.10.11, the GPU model works fine. On version 0.10.14 it fails.
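Given that, a possible interim workaround is to pin the dependency to the last version that reportedly worked (assuming 0.10.11 remains available on Maven):

```groovy
dependencies {
    // Pin to 0.10.11: the GPU model reportedly works on this version,
    // while 0.10.14 shows the repeated-junk regression described above.
    implementation 'com.google.mediapipe:tasks-genai:0.10.11'
}
```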

google-ml-butler[bot] commented 1 month ago

Are you satisfied with the resolution of your issue?

KosuriSireesha commented 1 month ago

Reopened the issue.

kuaashish commented 1 month ago

Hi @PaulTR,

Could you please look into this issue? Currently, we do not have a real device to reproduce it, and we are not sure whether the issue is specific to a particular device. Based on the report, the sample app appears to behave the same way.

Thank you!!