google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Getting gibberish output using phi-2 when building Mediapipe from source #5480

Closed AhSinan closed 1 month ago

AhSinan commented 3 months ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

None

OS Platform and Distribution

Android 14

Mobile device if the issue happens on mobile device

Pixel 8 pro

Browser and version if the issue happens on browser

N/A

Programming Language and version

C++/Java

MediaPipe version

v0.10.14

Bazel version

6.1.1

Solution

MediaPipe

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

NDK r25c

Xcode & Tulsi version (if issue is related to building for iOS)

N/A

Describe the actual behavior

Running phi-2 after conversion (with my build from source) is producing gibberish output.

Describe the expected behaviour

Running phi-2 after conversion (with my build from source) should produce non-gibberish output.

Standalone code/steps you may have used to try to get what you need

I used this Colab notebook to convert the phi-2 model to the MediaPipe format:
https://colab.research.google.com/github/googlesamples/mediapipe/blob/main/examples/llm_inference/conversion/llm_conversion.ipynb
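
For reference, this is roughly the conversion call that notebook makes, as a sketch; the checkpoint and output paths below are placeholders, not the notebook's exact values:

```python
# Sketch of the phi-2 -> MediaPipe conversion, loosely following the
# llm_conversion.ipynb notebook. All paths are placeholders.
from mediapipe.tasks.python.genai import converter

config = converter.ConversionConfig(
    input_ckpt="/content/phi-2/",             # HF safetensors checkpoint directory (placeholder)
    ckpt_format="safetensors",
    model_type="PHI_2",
    backend="gpu",                            # "cpu" is the other option
    output_dir="/content/intermediate/phi-2/",
    combine_file_only=False,
    vocab_model_file="/content/phi-2/",       # tokenizer files sit alongside the checkpoint
    output_tflite_file="/content/phi2_gpu.bin",
)
converter.convert_checkpoint(config)
```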

Then I built MediaPipe + XNNPACK from source at the v0.10.14 commit, using the following command:

```shell
$ bazel build -c opt \
    --config=android_arm64 \
    --strip=ALWAYS \
    --legacy_whole_archive=0 \
    --features=-legacy_whole_archive \
    --copt=-fvisibility=hidden \
    --copt=-ffunction-sections \
    --copt=-fdata-sections \
    --copt=-fstack-protector \
    --copt=-Oz \
    --copt=-fomit-frame-pointer \
    --copt=-DABSL_MIN_LOG_LEVEL=2 \
    --linkopt=-Wl,--gc-sections,--strip-all \
    //mediapipe/tasks/java/com/google/mediapipe/tasks/genai:tasks_genai.aar
```

I did the same for tasks_core.aar.

Then I used the MediaPipe LLM Inference sample app to test this: https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/llm_inference/android

When running phi-2 with my build I am getting gibberish output.

It is worth noting that the Gemma model works fine after conversion with my build; I am only facing this issue with other models (e.g. phi-2). For some reason, when I use the released AARs from Maven, all models work fine; the issue only happens when I build MediaPipe + XNNPACK from source.


Other info / Complete Logs

The current output I am getting with my build from source:

```shell
Ġ=ĊTheĠlatestĠnewsĠandĠresearchĠonĠtheĠglobalĠeconomyĊĊInĠthisĠarticle,ĠweĠwillĠdiscussĠtheĠimpactĠofĠtheĠCOVID-19ĠpandemicĠonĠtheĠglobalĠeconomy.ĊTheĠglobalĠeconomyĠisĠexpectedĠtoĠshrinkĠbyĠ4.32%ĠinĠ2020,ĠandĠitĠisĠprojectedĠtoĠrecoverĠtoĠpre-pandemicĠlevelsĠbyĠ2023.ĊTheĠglobalĠeconomyĠisĠtheĠsumĠofĠallĠtheĠeconomicĠactivitiesĠthatĠtakeĠplaceĠwithinĠaĠcountry'sĠborders.ĠItĠisĠaĠcountry'sĠGrossĠDomesticĠProductĠ(GDP)ĠandĠisĠaĠmeasureĠofĠtheĠvalueĠofĠallĠtheĠ
```
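
A minimal sketch, assuming the Ġ/Ċ characters are the GPT-2-style byte-level BPE markers for a leading space and a newline; under that assumption the log is ordinary text emitted as raw token strings, i.e. detokenization is being skipped:

```python
# Sketch (assumption): treat Ġ (U+0120) and Ċ (U+010A) as the GPT-2 byte-level
# BPE encodings of space and newline. If that holds, the output above is
# readable text printed as raw token strings rather than detokenized text.
raw = "Ġ=ĊTheĠlatestĠnewsĠandĠresearchĠonĠtheĠglobalĠeconomy"
print(raw.replace("Ġ", " ").replace("Ċ", "\n"))
# -> " =\nThe latest news and research on the global economy"
```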
kuaashish commented 2 months ago

Hi @AhSinan,

Apologies for the delayed response. Could you please let us know if this has been resolved on your end, or if you are still seeking a resolution? If you are still looking for a solution, could you also let us know if you are testing this on a real device or an emulator?

Thank you!!

github-actions[bot] commented 1 month ago

This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 1 month ago

This issue was closed due to lack of activity after being marked stale for the past 7 days.
