pytorch / android-demo-app

PyTorch android examples of usage in applications

Force stopping app. #16

Open sunn-e opened 5 years ago

sunn-e commented 5 years ago

Downloaded the assets and tried running on an Android device. Log:

java.lang.RuntimeException: Unable to start activity ComponentInfo{org.pytorch.helloworld/org.pytorch.helloworld.MainActivity}: com.facebook.jni.CppException: false CHECK FAILED at ../torch/csrc/jit/import.cpp (deserialize at ../torch/csrc/jit/import.cpp:178)
(no backtrace available)
    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2946)
    at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3081)
    at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:78)
    at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:108)
    at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:68)
    at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1831)
    at android.os.Handler.dispatchMessage(Handler.java:106)
    at android.os.Looper.loop(Looper.java:201)
    at android.app.ActivityThread.main(ActivityThread.java:6806)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:547)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:873)
 Caused by: com.facebook.jni.CppException: false CHECK FAILED at ../torch/csrc/jit/import.cpp (deserialize at ../torch/csrc/jit/import.cpp:178)
(no backtrace available)
    at org.pytorch.Module$NativePeer.initHybrid(Native Method)
    at org.pytorch.Module$NativePeer.<init>(Module.java:70)
    at org.pytorch.Module.<init>(Module.java:25)
    at org.pytorch.Module.load(Module.java:21)
    at org.pytorch.helloworld.MainActivity.onCreate(MainActivity.java:39)
    at android.app.Activity.performCreate(Activity.java:7224)
    at android.app.Activity.performCreate(Activity.java:7213)
    at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1272)
    at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2926)
    ... 11 more

IvanKobzarev commented 5 years ago

@sunn-e This error usually happens when the model was serialized and interpreted using different versions of libtorch. Did it happen with the models from this repository, or did you retrace/rescript 'model.pt' as in the instructions?

If you traced it yourself and use the Gradle dependency 'org.pytorch:pytorch_android:1.3.0', please check that your Python torch version is 1.3.0:

└─ $ python -c 'import torch; print(torch.version.__version__)'
1.3.0
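
If the versions differ, re-exporting with a matching torch version usually fixes this deserialization failure. A minimal sketch, assuming a resnet18 stand-in model and an illustrative assets path rather than the reporter's actual setup:

import torch
import torchvision

# Trace with the same torch version (1.3.0) as the org.pytorch:pytorch_android
# dependency so the serialized .pt format matches the libtorch on the device.
model = torchvision.models.resnet18(pretrained=True)  # stand-in model
model.eval()

example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)
traced.save("HelloWorldApp/app/src/main/assets/model.pt")  # illustrative path
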
jiayong commented 5 years ago

Thanks for your reply.

Yeongtae commented 5 years ago

@IvanKobzarev Although I used 1.3.0 to convert my customized model, I see the same error message in Android Studio.

(screenshots: the same error shown in Android Studio)

Are there any unsupported layers or data types, such as fp16 layers? Or is there another checklist I should go through to solve it?

dev-michael-schmidt commented 5 years ago

Getting the same error.

torch==1.3.0 torchvision==0.4.1

data is just in torch.float

x_train = x_train.to(device=device, dtype=torch.float)
y_train = y_train.to(device=device, dtype=torch.float)
x_test = x_test.to(device=device, dtype=torch.float)
y_test = y_test.to(device=device, dtype=torch.float)

model is pretty standard:

class Model(torch.nn.Module):

    def __init__(self):
        super(Model, self).__init__()

        self.input_1 = torch.nn.Linear(n_features, layer_1_size)
        self.prelu_1 = torch.nn.PReLU()
        self.hidden_2 = torch.nn.Linear(layer_1_size, layer_2_size)
        self.prelu_2 = torch.nn.PReLU()
        self.hidden_3 = torch.nn.Linear(layer_2_size, layer_3_size)
        self.prelu_3 = torch.nn.PReLU()
        self.out_4 = torch.nn.Linear(layer_3_size, n_classes)

        self.drop = torch.nn.Dropout(0.25)

    def forward(self, x):
        x = self.prelu_1(self.input_1(x))
        x = self.drop(x)
        x = self.prelu_2(self.hidden_2(x))
        x = self.drop(x)
        x = self.prelu_3(self.hidden_3(x))
        x = self.drop(x)
        x = self.out_4(x)

        return x

and the model export is based on the example, ensuring dtype=torch.float:

model.eval()
example = torch.rand(1, num_features, dtype=torch.float)
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("../xxx/app/src/main/assets/model.pt")
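
A hedged sanity check (not part of the original report): reload the saved TorchScript file in the same Python environment and run a forward pass, so an export problem shows up before the app is built. num_features here is the same value used for tracing above.

import torch

# Reload the traced module that was just saved and run a dummy forward pass;
# if this fails here, the problem is in the export rather than on Android.
reloaded = torch.jit.load("../xxx/app/src/main/assets/model.pt")
dummy = torch.rand(1, num_features, dtype=torch.float)
print(reloaded(dummy))
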
Yeongtae commented 5 years ago

@IvanKobzarev Although I used 1.3.0 to convert my customized model, I see the same error message in Android Studio.

(screenshots: the same error shown in Android Studio)

Are there any unsupported layers or data types, such as fp16 layers? Or is there another checklist I should go through to solve it?

When I use the file name 'asr.pt', it produces the same error, but it is fine with another file name. (screenshot)

antonlebedjko commented 5 years ago

Same error for me, and my Python torch version is 1.3.0.

dev-michael-schmidt commented 5 years ago

Again, 1.3.0 for both pytorch_android and pytorch.

2019-11-08 14:05:49.886 9870-9870/com.dummy.app W/com.dummy.app: Got a deoptimization request on un-deoptimizable method com.facebook.jni.HybridData org.pytorch.Module$NativePeer.initHybrid(java.lang.String)
2019-11-08 14:05:51.531 9870-9870/com.dummy.app D/AndroidRuntime: Shutting down VM
2019-11-08 14:05:51.542 9870-9870/com.dummy.app E/AndroidRuntime: FATAL EXCEPTION: main
    Process: com.dummy.app, PID: 9870
    java.lang.RuntimeException: Unable to start activity ComponentInfo{com.dummy.app/com.dummy.app.MainActivity}: com.facebook.jni.CppException: false CHECK FAILED at aten/src/ATen/Functions.h (empty at aten/src/ATen/Functions.h:3535)
    (no backtrace available)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3270)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3409)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2016)
        at android.os.Handler.dispatchMessage(Handler.java:107)
        at android.os.Looper.loop(Looper.java:214)
        at android.app.ActivityThread.main(ActivityThread.java:7356)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:492)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:930)
     Caused by: com.facebook.jni.CppException: false CHECK FAILED at aten/src/ATen/Functions.h (empty at aten/src/ATen/Functions.h:3535)
    (no backtrace available)
        at org.pytorch.Module$NativePeer.initHybrid(Native Method)
        at org.pytorch.Module$NativePeer.<init>(Module.java:70)
        at org.pytorch.Module.<init>(Module.java:25)
        at org.pytorch.Module.load(Module.java:21)
        at com.dummy.app.MainActivity.onCreate(MainActivity.java:27)
        at android.app.Activity.performCreate(Activity.java:7802)
        at android.app.Activity.performCreate(Activity.java:7791)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1299)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3245)
            ... 11 more
dev-michael-schmidt commented 5 years ago

Using:

#include <torch/script.h> // One-stop header.

#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {

  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }

  torch::jit::script::Module module;
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load(argv[1]);
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::cout << "ok\n";
}

I am able to get:

./example-app ../../MyApplication/app/src/main/assets/real_model.pt
ok
josecyn commented 5 years ago

Hi there,

I had the same problem and, in my case, I solved it.

I created the PyTorch traced model using the provided tracing script. The output filename was traced_model.pt, but when I added it to the assets folder of the Android HelloWorld demo I changed the filename to model_2.pth. In that scenario the app crashed with the stack trace above.

Leaving the filename as traced_model.pt, the name it was created with, solved the issue. Strange, huh?

peterzhang2029 commented 5 years ago

Hi there,

I had the same problem and, in my case, I solved it.

I created the PyTorch traced model using the provided tracing script. The output filename was traced_model.pt, but when I added it to the assets folder of the Android HelloWorld demo I changed the filename to model_2.pth. In that scenario the app crashed with the stack trace above.

Leaving the filename as traced_model.pt, the name it was created with, solved the issue. Strange, huh?

I also have the same problem and tried this method, but it does not always work for me. So strange...

IvanKobzarev commented 5 years ago

@josecyn, @Yeongtae Sorry for my late reply.

My guess is that the change in behavior when renaming the asset could come from the latest assets not being re-uploaded to the device/emulator. Do you have the same problem if you fully uninstall and then install the application on the device/emulator after renaming?

Sometimes I had an issue where adb install apk.apk did not reinstall the native libraries, even if the Gradle dependencies were updated and the Java part was reinstalled on the device. In those cases, manually uninstalling the app helped me.

IvanKobzarev commented 5 years ago

Hello @MichaelSchmidt82, sorry for my late reply.

I checked your model on the latest nightly builds and it worked OK for me: it loads, and forward() works. Does that error still happen for you with the latest PyTorch Android nightlies? (You might need to retrace/script your model with the latest Python nightlies so the .pt file is aligned with the pytorch_android version.) (We have had several major fixes since you reported the issue.)

To use the nightlies (to force a refresh of dependencies, Gradle has the argument --refresh-dependencies):

repositories {
    maven {
        url "https://oss.sonatype.org/content/repositories/snapshots"
    }
}

dependencies {
    ...
    implementation 'org.pytorch:pytorch_android:1.4.0-SNAPSHOT'
    implementation 'org.pytorch:pytorch_android_torchvision:1.4.0-SNAPSHOT'
    ...
}
dev-michael-schmidt commented 5 years ago

I will try it later.


hcflrl commented 4 years ago

@IvanKobzarev I use 1.4.0-SNAPSHOT, but I get "A/libc: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)" when loading the module.

paramsen commented 4 years ago

FYI: we ran into this exception, which halted us for a day. Eventually we came to the conclusion that it worked with our locally trained models but not with our cloud-trained models, because the cloud-trained ones had CUDA enabled. We disabled CUDA and voilà, it works again.

CUDA incompatibility should produce a clear exception message; the same goes for all the PyTorch Mobile exceptions I've hit so far :)

dreiss commented 4 years ago

Missing detail in error messages is a known problem in the 1.3 release. Check out the just-released PyTorch 1.4!

andreybicalho commented 4 years ago

FYI: we ran into this exception, which halted us for a day. Eventually we came to the conclusion that it worked with our locally trained models but not with our cloud-trained models, because the cloud-trained ones had CUDA enabled. We disabled CUDA and voilà, it works again.

CUDA incompatibility should produce a clear exception message; the same goes for all the PyTorch Mobile exceptions I've hit so far :)

I was trying to serialize a model to load on mobile, and your answer helped. In case somebody needs it, just move your model to the CPU and then use TorchScript to serialize it, like so:

cpu_model = gpu_model.cpu()
sample_input_cpu = sample_input_gpu.cpu()
traced_cpu = torch.jit.trace(cpu_model, sample_input_cpu)
torch.jit.save(traced_cpu, "cpu.pth")

ref: https://pytorch.org/docs/master/jit.html#creating-torchscript-code
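
A related sketch, assuming the cloud job hands you only a saved state_dict rather than a live GPU model (the checkpoint name and the resnet18 architecture are placeholders): torch.load takes map_location="cpu", so the weights never need a GPU at export time.

import torch
import torchvision

# Placeholders: "cloud_checkpoint.pth" stands in for the cloud-trained state_dict
# and resnet18 for whatever architecture was actually trained.
model = torchvision.models.resnet18()
state_dict = torch.load("cloud_checkpoint.pth", map_location="cpu")  # load weights onto CPU
model.load_state_dict(state_dict)
model.eval()

example = torch.rand(1, 3, 224, 224)  # illustrative input shape
traced = torch.jit.trace(model, example)
traced.save("app/src/main/assets/model.pt")
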

dingusagar commented 3 years ago

I fixed my issue by changing the path in trace_model.py from app/src/main/assets/model.pt to app/src/main/assets/model.pth