Kitt-AI / snowboy

Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy

Android demo app crash #40

Closed WksKing closed 7 years ago

WksKing commented 8 years ago

I wrote a demo tool that uses the hotword detection function, but it crashes when creating a SnowboyDetect object. The error message is:

    W/art (11495): Before Android 4.1, method android.graphics.PorterDuffColorFilter android.support.graphics.drawable.VectorDrawableCompat.updateTintFilter(android.graphics.PorterDuffColorFilter, android.content.res.ColorStateList, android.graphics.PorterDuff$Mode) would have incorrectly overridden the package-private method in android.graphics.drawable.Drawable
    E/art (11495): No implementation found for long ai.kitt.snowboy.snowboyJNI.new_SnowboyDetect(java.lang.String, java.lang.String) (tried Java_ai_kitt_snowboy_snowboyJNI_new_1SnowboyDetect and Java_ai_kitt_snowboy_snowboyJNI_new_1SnowboyDetect__Ljava_lang_String_2Ljava_lang_String_2)
    D/AndroidRuntime(11495): Shutting down VM
    E/AndroidRuntime(11495): FATAL EXCEPTION: main
    E/AndroidRuntime(11495): Process: com.wistron.demo.tool.snowboy_demo, PID: 11495
    E/AndroidRuntime(11495): java.lang.UnsatisfiedLinkError: No implementation found for long ai.kitt.snowboy.snowboyJNI.new_SnowboyDetect(java.lang.String, java.lang.String) (tried Java_ai_kitt_snowboy_snowboyJNI_new_1SnowboyDetect and Java_ai_kitt_snowboy_snowboyJNI_new_1SnowboyDetect__Ljava_lang_String_2Ljava_lang_String_2)
    E/AndroidRuntime(11495):     at ai.kitt.snowboy.snowboyJNI.new_SnowboyDetect(Native Method)
    E/AndroidRuntime(11495):     at ai.kitt.snowboy.SnowboyDetect.<init>(SnowboyDetect.java:39)
    E/AndroidRuntime(11495):     at com.wistron.demo.tool.snowboy_demo.MainActivity.onCreate(MainActivity.java:16)
    E/AndroidRuntime(11495):     at android.app.Activity.performCreate(Activity.java:5933)
    E/AndroidRuntime(11495):     at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1105)
    E/AndroidRuntime(11495):     at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2251)
    E/AndroidRuntime(11495):     at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2358)
    E/AndroidRuntime(11495):     at android.app.ActivityThread.access$800(ActivityThread.java:144)
    E/AndroidRuntime(11495):     at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1278)
    E/AndroidRuntime(11495):     at android.os.Handler.dispatchMessage(Handler.java:102)
    E/AndroidRuntime(11495):     at android.os.Looper.loop(Looper.java:135)
    E/AndroidRuntime(11495):     at android.app.ActivityThread.main(ActivityThread.java:5219)
    E/AndroidRuntime(11495):     at java.lang.reflect.Method.invoke(Native Method)
    E/AndroidRuntime(11495):     at java.lang.reflect.Method.invoke(Method.java:372)
    E/AndroidRuntime(11495):     at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:898)
    E/AndroidRuntime(11495):     at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:693)
    W/ActivityManager( 828): Force finishing activity com.wistron.demo.tool.snowboy_demo/.MainActivity


My code snippet is below:

Java code:

    // Assume you put the model related files under /sdcard/snowboy/
    SnowboyDetect snowboyDetector = new SnowboyDetect("/storage/emulated/legacy/common.res",
            "/storage/emulated/legacy/snowboy.umdl");
    snowboyDetector.SetSensitivity("0.45");         // Sensitivity for each hotword
    snowboyDetector.SetAudioGain(2.0f);              // Audio gain for detection
    short[] buffer = new short[1024];
    int result = snowboyDetector.RunDetection(buffer, buffer.length);   // buffer is a short array.

I have pushed the common.res and snowboy.umdl files to the root folder of the Android internal storage. Those files come from snowboy-master/resources/.

AndroidManifest.xml

    <?xml version="1.0" encoding="utf-8"?>
    <manifest xmlns:android="http://schemas.android.com/apk/res/android"
        package="com.wistron.demo.tool.snowboy_demo">

        <application android:allowBackup="true" android:icon="@mipmap/ic_launcher"
            android:label="@string/app_name" android:supportsRtl="true" android:theme="@style/AppTheme">
            <activity android:name=".MainActivity">
                <intent-filter>
                    <action android:name="android.intent.action.MAIN" />
                    <category android:name="android.intent.category.LAUNCHER" />
                </intent-filter>
            </activity>
        </application>

    </manifest>

thanks in advance!

WksKing commented 8 years ago

The app crash is fixed: I added the code below to snowboyJNI.java

    static {
        System.loadLibrary("snowboy-detect-android");
    }
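For readers hitting the same UnsatisfiedLinkError, here is a minimal sketch of loading a native library defensively, so that a missing library is reported at load time rather than at the first native call. The class NativeLoader and its method tryLoad are hypothetical helpers, not part of Snowboy; only System.loadLibrary is a real JDK/Android call.

```java
// Hypothetical helper (not part of Snowboy): loads a native library and
// reports a failure instead of letting a later native call crash the app.
final class NativeLoader {
    private NativeLoader() {}

    /** Returns true if the library was loaded, false if it could not be found or linked. */
    static boolean tryLoad(String libraryName) {
        try {
            // On Android/Linux the name "foo" resolves to the file libfoo.so.
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load native library '" + libraryName + "': " + e.getMessage());
            return false;
        }
    }
}
```

With the name used in this thread, the call would be NativeLoader.tryLoad("snowboy-detect-android"). On Android this presumably also requires that libsnowboy-detect-android.so is packaged in the APK for the device's ABI.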

WksKing commented 8 years ago

Another question: I want this demo tool to keep recording and recognizing the user's voice until it hears the keyword "snowboy". How should I do that? The sample demo is too simple:

    // Assume you put the model related files under /sdcard/snowboy/
    SnowboyDetect snowboyDetector = new SnowboyDetect("/storage/emulated/legacy/common.res",
            "/storage/emulated/legacy/snowboy.umdl");
    snowboyDetector.SetSensitivity("0.45");         // Sensitivity for each hotword
    snowboyDetector.SetAudioGain(2.0f);              // Audio gain for detection
    short[] buffer = new short[1024];
    int result = snowboyDetector.RunDetection(buffer, buffer.length);   // buffer is a short array.

The result is returned immediately. Is opening and closing the microphone controlled by me or by this library? Are any other public APIs provided? The current API contains:

    public class snowboyJNI {
        static {
            System.loadLibrary("snowboy-detect-android");
        }

        public final static native long new_SnowboyDetect(String jarg1, String jarg2);
        public final static native boolean SnowboyDetect_Reset(long jarg1, SnowboyDetect jarg1_);
        public final static native int SnowboyDetect_RunDetection__SWIG_0(long jarg1, SnowboyDetect jarg1_, String jarg2);
        public final static native int SnowboyDetect_RunDetection__SWIG_1(long jarg1, SnowboyDetect jarg1_, float[] jarg2, int jarg3);
        public final static native int SnowboyDetect_RunDetection__SWIG_2(long jarg1, SnowboyDetect jarg1_, short[] jarg2, int jarg3);
        public final static native int SnowboyDetect_RunDetection__SWIG_3(long jarg1, SnowboyDetect jarg1_, int[] jarg2, int jarg3);
        public final static native void SnowboyDetect_SetSensitivity(long jarg1, SnowboyDetect jarg1_, String jarg2);
        public final static native String SnowboyDetect_GetSensitivity(long jarg1, SnowboyDetect jarg1_);
        public final static native void SnowboyDetect_SetAudioGain(long jarg1, SnowboyDetect jarg1_, float jarg2);
        public final static native void SnowboyDetect_UpdateModel(long jarg1, SnowboyDetect jarg1_);
        public final static native int SnowboyDetect_NumHotwords(long jarg1, SnowboyDetect jarg1_);
        public final static native int SnowboyDetect_SampleRate(long jarg1, SnowboyDetect jarg1_);
        public final static native int SnowboyDetect_NumChannels(long jarg1, SnowboyDetect jarg1_);
        public final static native int SnowboyDetect_BitsPerSample(long jarg1, SnowboyDetect jarg1_);
        public final static native void delete_SnowboyDetect(long jarg1);
    }

Thanks.

Smanar commented 8 years ago

I don't know how SWIG works (I'm using the original C++ version), but I think you can find what you are looking for here: https://github.com/Kitt-AI/snowboy/blob/master/examples/Python/snowboydecoder.py

And no, opening and closing the microphone isn't controlled by the library. In this example, PortAudio (from Python) is used to fill a "RingBuffer" (a circular audio buffer), and every X seconds you send this buffer to the library to check whether it contains a hotword.

The library only checks the buffer; it doesn't control audio capture.
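The RingBuffer idea can be sketched in Java roughly as follows. This is a minimal illustration under my own assumptions, not code from Snowboy or from snowboydecoder.py: the class ShortRingBuffer and its methods write and drain are hypothetical names. A recording thread would call write, and a detection thread would periodically call drain and pass the result to the detector.

```java
// Hypothetical fixed-capacity ring buffer for 16-bit PCM samples.
final class ShortRingBuffer {
    private final short[] data;
    private int head = 0; // index of the oldest stored sample
    private int size = 0; // number of samples currently stored

    ShortRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        data = new short[capacity];
    }

    /** Appends count samples; once the buffer is full, the oldest samples are overwritten. */
    synchronized void write(short[] samples, int count) {
        for (int i = 0; i < count; i++) {
            int tail = (head + size) % data.length;
            data[tail] = samples[i];
            if (size < data.length) {
                size++;
            } else {
                head = (head + 1) % data.length; // the oldest sample was just overwritten
            }
        }
    }

    /** Removes and returns everything buffered so far, oldest sample first. */
    synchronized short[] drain() {
        short[] out = new short[size];
        for (int i = 0; i < size; i++) {
            out[i] = data[(head + i) % data.length];
        }
        head = 0;
        size = 0;
        return out;
    }
}
```

Both methods are synchronized because the writer and the reader are expected to run on different threads.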

chenguoguo commented 8 years ago

Thanks @Smanar for the awesome answer! Yes, Snowboy doesn't control the microphone itself; it only reads the audio data (chunk by chunk) and returns a number indicating whether the hotword has been detected. The general process looks like this:

  1. Set up the microphone
  2. Send audio data chunk by chunk to Snowboy
  3. If a hotword is detected, do whatever you want with it. For example, the C++ example only prints a message, but you can do much more, e.g., send the audio to a server-side speech recognizer.
  4. Go back to step 2.
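
The four steps above can be sketched as a loop. To keep the sketch runnable without Android or the native library, the microphone and the detector are hidden behind two hypothetical interfaces, AudioSource and HotwordDetector; neither is part of Snowboy. On Android, I would expect AudioSource to wrap android.media.AudioRecord.read(short[], int, int) and HotwordDetector to wrap SnowboyDetect.RunDetection(short[], int), with the loop running on a background thread.

```java
// Hypothetical abstraction over the microphone (step 1 happens when it is created).
interface AudioSource {
    /** Fills buffer with samples; returns the number of samples read, or -1 when the source ends. */
    int read(short[] buffer);
}

// Hypothetical abstraction with the same shape as SnowboyDetect.RunDetection(short[], int).
interface HotwordDetector {
    int runDetection(short[] buffer, int length);
}

final class DetectionLoop {
    private DetectionLoop() {}

    /** Feeds audio chunk by chunk until a hotword is reported; returns its index, or 0 if the audio ended first. */
    static int waitForHotword(AudioSource source, HotwordDetector detector, int chunkSize) {
        short[] buffer = new short[chunkSize];
        while (true) {
            int read = source.read(buffer);                   // step 2: get the next chunk
            if (read < 0) {
                return 0;                                     // no more audio
            }
            if (read == 0) {
                continue;                                     // nothing captured yet, try again
            }
            int result = detector.runDetection(buffer, read); // step 2: send the chunk to the detector
            if (result > 0) {
                return result;                                // step 3: hotword detected
            }
            // step 4: otherwise go back to step 2
        }
    }
}
```

The caller decides what happens in step 3, for example starting a speech recognizer and then calling waitForHotword again.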

WksKing commented 8 years ago

Thanks @Smanar and @chenguoguo for your kind help. With your support I wrote a demo tool using the snowboy library; you can find my demo project at: Android_snowboy_demo.

It can detect the hotword, but detection is unreliable: it succeeds about once in 20 attempts. I also tried "Create Hot Word" on the Snowboy dashboard web page, and the hotword was just as hard to recognize in the final test step.

Do you have any suggestions to improve it?

Orz...

chenguoguo commented 8 years ago

For the universal model "Snowboy" I also noticed that it performs worse on my Android phone than on my laptop. It's mostly caused by the mismatched microphones. The training data was collected on laptops, which might be very different from your Android phone's microphone. I played with the audio gain and sensitivity to make the detection work. You can also play with those parameters, e.g., increasing the sensitivity to something like 0.6 or 0.7 (of course this will increase the false alarm rate).

For the personal model created from the Snowboy dashboard, you also have to make sure that you use the same microphone for recording and testing. You can record 3 audio samples with the microphone on your Android device and then submit those samples to the website. That should give you better performance, since the same microphone is then used for training and testing.

WksKing commented 8 years ago

@chenguoguo, thank you. You're right, the success rate has increased greatly (sensitivity 0.5, audio gain 1.0); the pass rate is now about 25 percent. I probably need more testing and parameter tuning.

If I have new results, I will comment here.

Thanks again.

chenguoguo commented 8 years ago

@WksKing you can increase the sensitivity further; it won't cause many false alarms. Try playing with the parameters and see if you can get a detection rate of over 90%.
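
To compare parameter settings with numbers rather than by feel, one option is to replay the same set of recorded clips for each setting and count detections. The sketch below is only an illustration: the class DetectionRate and its method of are hypothetical, and the detector is passed in as a plain function so the code runs without the native library.

```java
// Hypothetical helper for measuring how often a hotword is detected.
final class DetectionRate {
    private DetectionRate() {}

    /**
     * Returns the fraction of clips for which detect reports a hotword,
     * i.e. a result greater than 0.
     */
    static double of(java.util.List<short[]> clips,
                     java.util.function.ToIntFunction<short[]> detect) {
        if (clips.isEmpty()) {
            throw new IllegalArgumentException("at least one clip is required");
        }
        int hits = 0;
        for (short[] clip : clips) {
            if (detect.applyAsInt(clip) > 0) {
                hits++;
            }
        }
        return (double) hits / clips.size();
    }
}
```

On a device, detect could be something like clip -> snowboyDetector.RunDetection(clip, clip.length), called after SetSensitivity and SetAudioGain have been set to the values under test. Calling Reset() between clips is probably a good idea, although that is my assumption rather than something stated in this thread.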

alexandregarret commented 7 years ago

Hello @chenguoguo, could you tell me what the common.res file is? Why is this file not generated when I download a new model via the web page?

Thanks and regards

chenguoguo commented 7 years ago

This file contains the resources we need for computation. It's different from the model, and it's included in the pre-compiled decoders. You can also find it in the GitHub repository:

https://github.com/Kitt-AI/snowboy/blob/master/resources/common.res

alexandregarret commented 7 years ago

@chenguoguo Thank you for your answer. Is it possible to have only one detector instance for 2 hotwords? Or is each detector linked to only one hotword (in which case, what is the return value '2', corresponding to hotword 2 detected, used for)?

Thanks

chenguoguo commented 7 years ago

Yes that's doable.

You can follow this thread from the forum: https://groups.google.com/a/kitt.ai/forum/#!topic/snowboy-discussion/KMMkTd6ewWE

You can also check out the example for using two hotwords: https://github.com/Kitt-AI/snowboy/blob/master/examples/Python/demo2.py
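
As a rough Java sketch of the two-hotword setup: as far as I recall from the comments in snowboy-detect.h, the model argument and the sensitivity string are both comma-separated lists, and the detection result is -2 for silence, -1 for an error, 0 for no hotword, and N > 0 for the N-th model in the list. Please verify that against your version of the library. The class MultiHotword and its methods are hypothetical helpers, and the file names in the test data are made up.

```java
// Hypothetical helpers for configuring one detector with several hotword models.
final class MultiHotword {
    private MultiHotword() {}

    /** Joins model file paths into one comma-separated string. */
    static String joinModels(String... modelPaths) {
        return String.join(",", modelPaths);
    }

    /** Joins per-model sensitivities into one comma-separated string, e.g. "0.45,0.5". */
    static String joinSensitivities(double... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    /** Maps a detection result to a label; result codes are assumed as described above. */
    static String describe(int result, String[] hotwordNames) {
        if (result > 0 && result <= hotwordNames.length) {
            return "hotword: " + hotwordNames[result - 1];
        }
        switch (result) {
            case 0:  return "no hotword";
            case -1: return "error";
            case -2: return "silence";
            default: return "unknown result " + result;
        }
    }
}
```

The joined strings would then be passed as the second argument of new SnowboyDetect(...) and to SetSensitivity(...), so a return value of 2 would refer to the second model in the list.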