zhangjh opened this issue 1 year ago
Thanks for reporting the issue.
Could you please share the model file here, or a Google Drive link to the model? The saved input/output data in numpy format would also be helpful for addressing this issue.
My build.gradle configuration is below:
plugins {
    id 'com.android.application'
}

android {
    namespace 'me.zhangjh.smart.search'
    compileSdk 33

    defaultConfig {
        applicationId "me.zhangjh.smart.search"
        minSdk 26
        targetSdk 33
        versionCode 1
        versionName "1.0"
        resConfigs "zh", "en"
        ndk {
            abiFilters 'armeabi-v7a', 'arm64-v8a'
        }
        testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
    }

    buildTypes {
        release {
            shrinkResources true
            minifyEnabled true
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
        }
    }

    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_9
        targetCompatibility JavaVersion.VERSION_1_9
    }

    // dexOptions {
    //     dexInProcess true
    //     preDexLibraries true
    //     javaMaxHeapSize "6g"
    // }

    testOptions {
        unitTests.includeAndroidResources = true
        unitTests.all {
            useJUnitPlatform()
        }
    }

    packagingOptions {
        exclude 'META-INF/DEPENDENCIES'
        exclude 'META-INF/NOTICE'
        exclude 'META-INF/LICENSE'
        exclude 'META-INF/LICENSE.txt'
        exclude 'META-INF/NOTICE.txt'
        exclude 'META-INF/native-image/linux-x86/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/linux-x86_64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/linux-ppc64le/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/linux-arm64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/linux-armhf/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/android-x86_64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/android-x86/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/android-arm/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/android-arm64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/ios-x86_64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/ios-arm64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/windows-x86_64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/windows-x86/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/macosx-arm64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/macosx-x86_64/jnijavacpp/jni-config.json'
        exclude 'META-INF/native-image/linux-x86/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/linux-ppc64le/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/linux-x86_64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/linux-arm64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/linux-armhf/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/android-x86_64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/android-x86/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/android-arm/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/android-arm64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/ios-x86_64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/ios-arm64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/windows-x86_64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/windows-x86/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/macosx-x86_64/jnijavacpp/reflect-config.json'
        exclude 'META-INF/native-image/macosx-arm64/jnijavacpp/reflect-config.json'
    }
}

dependencies {
    implementation 'androidx.appcompat:appcompat:1.6.1'
    implementation 'com.google.android.material:material:1.9.0'
    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'
    implementation 'com.microsoft.onnxruntime:onnxruntime-android:1.15.1'
    implementation 'org.nd4j:nd4j-native:1.0.0-M2.1:macosx-arm64'
    implementation 'org.nd4j:nd4j-native-platform:1.0.0-M2.1'
    implementation group: 'com.alibaba', name: 'fastjson', version: '2.0.34'
    androidTestImplementation 'junit:junit:4.13.2'
    androidTestImplementation 'androidx.test:runner:1.5.2'
    androidTestImplementation 'androidx.test.ext:junit:1.1.5'
    testImplementation 'junit:junit:4.13.2'
    testImplementation("org.junit.jupiter:junit-jupiter-api:5.7.0")
}
The core dependencies ('nd4j' and 'onnxruntime') are the same versions as in the Java project.
I also encountered the same issue with chinese-clip. I checked the input tensor and the model byte[] in memory on Android to make sure they were the same as in Java, but the result is always different (the Java inference result is correct). It confused me for a long time; I searched Google high and low but could not find the answer.
Any update on this, @zhangjh @wejoncy?
Same here. But I found that converting the onnx format to ort, and using the *.with_runtime_opt.ort version, may narrow the result gap a bit (the difference is still observed, but it is acceptable).
I also observed that the quantized model may exhibit this problem while the original model does not.
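For reference, the conversion described above can be done with ONNX Runtime's Python tooling (a sketch, assuming onnxruntime is installed via pip; the model file name is illustrative):

```shell
# Convert an ONNX model to ORT format. With Runtime optimization style,
# the tool also emits a *.with_runtime_opt.ort file whose optimizations
# are applied by the runtime on load.
python -m onnxruntime.tools.convert_onnx_models_to_ort model.onnx --optimization_style Runtime
```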
I've forgotten how I solved the issue because a long time has passed. Maybe we can overlook this issue, because I found that it works well in the Android environment now; however, I wonder why there are differences between the Android and Java environments.
Hi @zhangjh @greyovo @Young-Flash, I checked each op's output carefully. It's due to the specialized optimizations on different platforms (Linux x86-64, Android arm64/arm32). The error accumulates across layers. Does the error affect the final output in your end-to-end scenario?
A simple solution for now is to use NNAPI, which gives the same outputs as your Python use case, but it might be a bit slow.
If it ends up producing totally different outputs, such as for classification/detection, we will try to tweak the MatMul algorithm to improve the precision.
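For anyone trying the NNAPI workaround, a minimal sketch with the onnxruntime-android Java API (the class and model path here are illustrative; it needs an Android device or emulator to actually run):

```java
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

public class NnapiSessionFactory {
    public static OrtSession create(String modelPath) throws OrtException {
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        OrtSession.SessionOptions opts = new OrtSession.SessionOptions();
        // Register the NNAPI execution provider; any ops NNAPI cannot
        // handle fall back to the default CPU execution provider.
        opts.addNnapi();
        return env.createSession(modelPath, opts);
    }
}
```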
I checked each op's output carefully. It's due to the specialized optimizations on different platforms (Linux x86-64, Android arm64/arm32). The error accumulates across layers.
Does it involve all the ops in the model, or just a specific one?
Does the error affect the final output in your end-to-end scenario?
Yeah, it does affect the final output in the end-to-end scenario (text-image matching), making the result unacceptable.
If it ends up producing totally different outputs, such as for classification/detection, we will try to tweak the MatMul algorithm to improve the precision.
Thanks in advance.
Does it involve all the ops in the model, or just a specific one?
MatMulInteger is the culprit. Potential saturation might be the root cause.
Please use the NNAPI EP to work around it temporarily if it's urgent for your scenario.
We will update this issue once we figure out a solution.
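To illustrate what saturation in an integer MatMul can look like, here is a small NumPy sketch. It is a simplified model of SIMD kernels that sum adjacent uint8 x int8 products in an int16 accumulator (e.g. x86 pmaddubsw); the actual kernels ONNX Runtime selects differ per platform, so this is illustrative only:

```python
import numpy as np

# Two uint8 activations times int8 weights.
a = np.array([200, 220], dtype=np.uint8)
w = np.array([120, 120], dtype=np.int8)

# Exact result with a wide (int32) accumulator.
exact = int(a.astype(np.int32) @ w.astype(np.int32))

# Pairwise products summed, then clamped to the int16 range, mimicking
# a saturating 16-bit intermediate accumulator.
pairwise = a.astype(np.int32) * w.astype(np.int32)
saturated = int(np.clip(pairwise.sum(), -32768, 32767))

print(exact)      # 50400
print(saturated)  # 32767 -- the saturated result silently differs
```

The gap between the two results is exactly the kind of per-op error that then accumulates across layers.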
Describe the issue
I tried using onnxruntime to run the chinese-clip model in an Android environment. Because I am familiar with Java, I wrote the code in Java first, and the result was the same as the official Python version. When I moved the code to Android Studio, I found that the result was different. I have already checked the input tensors and code dependencies; they are all the same. Below are my core code fragments for running the ONNX model.
The same input tensors (I saved the data and checked them with vimdiff):
Different outputs:
And the core dependencies are the same:
Any thoughts or comments on this problem?
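As an aside, for saved numpy tensors a numeric comparison is often more informative than a textual diff like vimdiff; a small sketch (the file names and sample data are illustrative, to be replaced with the saved Java/Android tensors):

```python
import numpy as np

def max_abs_diff(path_a: str, path_b: str) -> float:
    """Load two saved .npy arrays and return their maximum absolute difference."""
    a, b = np.load(path_a), np.load(path_b)
    return float(np.abs(a.astype(np.float64) - b.astype(np.float64)).max())

# Illustrative usage with synthetic data.
np.save("out_java.npy", np.array([0.10, 0.90]))
np.save("out_android.npy", np.array([0.12, 0.88]))
print(max_abs_diff("out_java.npy", "out_android.npy"))
```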
To reproduce
See above for details
Urgency
No response
Platform
Android
OS Version
android studio emulator arm64-v8a
ONNX Runtime Installation
Released Package
Compiler Version (if 'Built from Source')
No response
Package Name (if 'Released Package')
onnxruntime-android
ONNX Runtime Version or Commit ID
1.15.0
ONNX Runtime API
Java/Kotlin
Architecture
ARM64
Execution Provider
Default CPU
Execution Provider Library Version
No response