Open sunn-e opened 5 years ago
@sunn-e This error usually happens when the model was serialized and loaded with different versions of libtorch. Did it happen with the models from this repository, or did you retrace/rescript 'model.pt' as described in the instructions?
If you traced it and use the Gradle dependency 'org.pytorch:pytorch_android:1.3.0', please check that your Python torch version is 1.3.0:
$ python -c 'import torch; print(torch.version.__version__)'
1.3.0
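The version check above can also be scripted, so that it runs before every export. This is only a minimal sketch: the helper names `major_minor` and `versions_aligned` are made up for illustration, and comparing just the major.minor part is an assumption, since the thread does not say how closely the two versions must match.

```python
import re


def major_minor(version):
    """Extract (major, minor) from strings such as '1.3.0', '1.3.0+cu100' or '1.4.0-SNAPSHOT'."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_aligned(python_torch_version, android_dep_version):
    """Assumption: the export and the runtime are 'aligned' when major.minor agree."""
    return major_minor(python_torch_version) == major_minor(android_dep_version)
```

With torch installed, the call would look like `versions_aligned(torch.__version__, "1.3.0")`, where the second argument is copied by hand from the `org.pytorch:pytorch_android` line in build.gradle.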
thx for your reply
@IvanKobzarev Although I use 1.3.0 to convert my customized model, I see the same error message in Android Studio.
Are there any unsupported layers or data types, such as fp16 layers? Or is there any other checklist to solve it?
Getting the same error.
torch==1.3.0 torchvision==0.4.1
data is just in torch.float
x_train = x_train.to(device=device, dtype=torch.float)
y_train = y_train.to(device=device, dtype=torch.float)
x_test = x_test.to(device=device, dtype=torch.float)
y_test = y_test.to(device=device, dtype=torch.float)
model is pretty standard:
class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.input_1 = torch.nn.Linear(n_features, layer_1_size)
        self.prelu_1 = torch.nn.PReLU()
        self.hidden_2 = torch.nn.Linear(layer_1_size, layer_2_size)
        self.prelu_2 = torch.nn.PReLU()
        self.hidden_3 = torch.nn.Linear(layer_2_size, layer_3_size)
        self.prelu_3 = torch.nn.PReLU()
        self.out_4 = torch.nn.Linear(layer_3_size, n_classes)
        self.drop = torch.nn.Dropout(0.25)

    def forward(self, x):
        x = self.prelu_1(self.input_1(x))
        x = self.drop(x)
        x = self.prelu_2(self.hidden_2(x))
        x = self.drop(x)
        x = self.prelu_3(self.hidden_3(x))
        x = self.drop(x)
        x = self.out_4(x)
        return x
and model export is based on the example, ensuring dtype=torch.float
model.eval()
example = torch.rand(1, num_features, dtype=torch.float)
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("../xxx/app/src/main/assets/model.pt")
When I use the file name 'asr.pt', I get the same error, but it works fine with a different file name.
Same error for me and my python torch version is 1.3.0
Again 1.3.0 for both pytorch_android, pytorch
2019-11-08 14:05:49.886 9870-9870/com.dummy.app W/com.dummy.app: Got a deoptimization request on un-deoptimizable method com.facebook.jni.HybridData org.pytorch.Module$NativePeer.initHybrid(java.lang.String)
2019-11-08 14:05:51.531 9870-9870/com.dummy.app D/AndroidRuntime: Shutting down VM
2019-11-08 14:05:51.542 9870-9870/com.dummy.app E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.dummy.app, PID: 9870
java.lang.RuntimeException: Unable to start activity ComponentInfo{com.dummy.app/com.dummy.app.MainActivity}: com.facebook.jni.CppException: false CHECK FAILED at aten/src/ATen/Functions.h (empty at aten/src/ATen/Functions.h:3535)
(no backtrace available)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3270)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3409)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2016)
at android.os.Handler.dispatchMessage(Handler.java:107)
at android.os.Looper.loop(Looper.java:214)
at android.app.ActivityThread.main(ActivityThread.java:7356)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:492)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:930)
Caused by: com.facebook.jni.CppException: false CHECK FAILED at aten/src/ATen/Functions.h (empty at aten/src/ATen/Functions.h:3535)
(no backtrace available)
at org.pytorch.Module$NativePeer.initHybrid(Native Method)
at org.pytorch.Module$NativePeer.<init>(Module.java:70)
at org.pytorch.Module.<init>(Module.java:25)
at org.pytorch.Module.load(Module.java:21)
at com.dummy.app.MainActivity.onCreate(MainActivity.java:27)
at android.app.Activity.performCreate(Activity.java:7802)
at android.app.Activity.performCreate(Activity.java:7791)
at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1299)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3245)
... 11 more
Using:
#include <torch/script.h> // One-stop header.
#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }
  torch::jit::script::Module module;
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load(argv[1]);
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }
  std::cout << "ok\n";
}
I am able to get
./example-app ../../MyApplication/app/src/main/assets/real_model.pt
ok
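Alongside the desktop libtorch test above, a quick structural check can rule out a truncated or corrupted copy of the file in the assets folder. This is a minimal sketch, assuming the file was written by `torch.jit.save` (or the module's `save` method), which writes a zip archive. The function name `inspect_model_archive` is hypothetical. It does not prove that the Android runtime can load the model, only that the archive itself is intact.

```python
import zipfile


def inspect_model_archive(path):
    """Return the entry names if `path` is an intact zip archive, else raise ValueError."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a zip archive")
    with zipfile.ZipFile(path) as archive:
        # testzip() returns the name of the first entry that fails its CRC
        # check, or None when every entry reads back cleanly.
        bad_entry = archive.testzip()
        if bad_entry is not None:
            raise ValueError(f"corrupted entry in {path}: {bad_entry}")
        return archive.namelist()
```

Running it on the copy under app/src/main/assets and on the freshly exported file should give the same list of entries if nothing was lost on the way.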
Hi there,
I had the same problem and, in my case, I solved it.
I created the PyTorch traced model using the provided tracing script. The output filename was traced_model.pt.
But when I added it to the assets folder of the Android HelloWorld demo, I renamed it to model_2.pth. In that scenario the app crashed with the stack trace above.
Leaving the filename as traced_model.pt, as it was upon creation, solved the issue.
Strange, huh?
I also have the same problem and tried this method, but it does not always work for me. So strange.
@josecyn , @Yeongtae Sorry for my late reply.
My guess is that the change in behavior after renaming an asset could come from the latest assets not being re-uploaded to the device/emulator. Do you have the same problem if you fully uninstall and then reinstall the application on the device/emulator after renaming?
Sometimes I had an issue where adb install apk.apk did not reinstall the native libraries, even though the Gradle dependencies were updated and the Java part was reinstalled on the device. In those cases, manually uninstalling the app helped me.
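To check the build side of the stale-asset guess above, the model packaged in the APK can be compared with the freshly exported file. This is a rough sketch: it relies on an APK being a zip archive with assets stored under `assets/`, and the function name and the default asset name `model.pt` are placeholders for illustration. It only inspects the built APK, not what is installed on the device, which is what the uninstall/reinstall step addresses.

```python
import hashlib
import zipfile


def sha256_bytes(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def asset_matches_apk(local_model_path, apk_path, asset_name="model.pt"):
    """True if assets/<asset_name> inside the APK is byte-identical to the local file."""
    with open(local_model_path, "rb") as f:
        local_digest = sha256_bytes(f.read())
    with zipfile.ZipFile(apk_path) as apk:
        try:
            packaged = apk.read(f"assets/{asset_name}")
        except KeyError:
            # The APK does not contain the asset under this name at all.
            return False
    return local_digest == sha256_bytes(packaged)
```

A `False` result after renaming or re-exporting the model would suggest that the APK was built from an older copy of the assets.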
Hello @MichaelSchmidt82 Sorry for my late reply.
I checked your model on the latest nightly builds and it worked for me: it loads and forward() works.
Does the error still happen for you with the latest PyTorch Android nightlies?
(You might need to retrace/script your model with the latest Python nightlies so that the .pt file is aligned with the pytorch_android version.)
(We have had several major fixes since you reported the issue.)
To use the nightlies (to force a refresh of dependencies, Gradle has the argument --refresh-dependencies):
repositories {
    maven {
        url "https://oss.sonatype.org/content/repositories/snapshots"
    }
}
dependencies {
    ...
    implementation 'org.pytorch:pytorch_android:1.4.0-SNAPSHOT'
    implementation 'org.pytorch:pytorch_android_torchvision:1.4.0-SNAPSHOT'
    ...
}
I will try it later.
@IvanKobzarev I use 1.4.0-SNAPSHOT, but I get "A/libc: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)" when loading the module.
FYI: we ran into this exception, which halted us for a day. Eventually we came to the conclusion that it worked with our locally trained models and not our cloud-trained models, because the cloud-trained ones had CUDA enabled. We disabled CUDA and voilà, it works again.
CUDA incompatibility should have a clear exception message; the same goes for all the PyTorch Mobile exceptions I've had so far :)
Missing detail in error messages is a known problem in the 1.3 release. Check out the just-released PyTorch 1.4!
I was trying to serialize a model to load on mobile, and your answer helped. In case somebody needs it: just move your model to the CPU and then use TorchScript to serialize it, like so:
cpu_model = gpu_model.cpu()
sample_input_cpu = sample_input_gpu.cpu()
traced_cpu = torch.jit.trace(cpu_model, sample_input_cpu)
torch.jit.save(traced_cpu, "cpu.pth")
ref: https://pytorch.org/docs/master/jit.html#creating-torchscript-code
I fixed my issue by changing the path in trace_model.py from app/src/main/assets/model.pt to app/src/main/assets/model.pth
Downloaded the assets and tried running on an Android device. Log:
java.lang.RuntimeException: Unable to start activity ComponentInfo{org.pytorch.helloworld/org.pytorch.helloworld.MainActivity}: com.facebook.jni.CppException: false CHECK FAILED at ../torch/csrc/jit/import.cpp (deserialize at ../torch/csrc/jit/import.cpp:178)
(no backtrace available)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2946)
at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3081)
at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:78)
at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:108)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:68)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1831)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loop(Looper.java:201)
at android.app.ActivityThread.main(ActivityThread.java:6806)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:547)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:873)
Caused by: com.facebook.jni.CppException: false CHECK FAILED at ../torch/csrc/jit/import.cpp (deserialize at ../torch/csrc/jit/import.cpp:178)
(no backtrace available)
at org.pytorch.Module$NativePeer.initHybrid(Native Method)
at org.pytorch.Module$NativePeer.<init>(Module.java:70)
at org.pytorch.Module.<init>(Module.java:25)
at org.pytorch.Module.load(Module.java:21)
at org.pytorch.helloworld.MainActivity.onCreate(MainActivity.java:39)
at android.app.Activity.performCreate(Activity.java:7224)
at android.app.Activity.performCreate(Activity.java:7213)
at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1272)
at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2926)
... 11 more