alphacep / vosk-android-demo

Offline speech recognition for Android with Vosk library.
Apache License 2.0

V/KaldiDemo: Assertion failed: (indexes->row_offsets.size() == time_offsets_.size()) #20

Closed chamecall closed 4 years ago

chamecall commented 4 years ago

When I run Kaldi, it works for some time, but after a few seconds (5-10) it dumps the following error:

V/KaldiDemo: Assertion failed: (indexes->row_offsets.size() == time_offsets_.size())

How can I get around it?

nshmyrev commented 4 years ago

Please provide full logcat output as well as phone model.

Please check the amount of free memory on your phone.

chamecall commented 4 years ago

Here's the free memory (screenshot: photo5253928520357358686).

Here's the output:

01/13 20:23:52: Launching 'app' on Sony D5503.
$ adb shell am start -n "fr.univavignon.alize.AndroidALIZEDemo/fr.univavignon.alize.AndroidALIZEDemo.MainActivity" -a android.intent.action.MAIN -c android.intent.category.LAUNCHER
Connected to process 3198 on device 'sony-d5503-CB5A1Y4PZZ'.
Capturing and displaying logcat messages from application. This behavior can be disabled in the "Logcat output" section of the "Debugger" settings page.
W/art: Before Android 4.1, method android.graphics.PorterDuffColorFilter android.support.graphics.drawable.VectorDrawableCompat.updateTintFilter(android.graphics.PorterDuffColorFilter, android.content.res.ColorStateList, android.graphics.PorterDuff$Mode) would have incorrectly overridden the package-private method in android.graphics.drawable.Drawable
W/linker: libalize-native.so: unused DT entry: type 0x6ffffffe arg 0x14fefc
    libalize-native.so: unused DT entry: type 0x6fffffff arg 0x3
D/OpenGLRenderer: Use EGL_SWAP_BEHAVIOR_PRESERVED: true
D/Atlas: Validating map...
I/Adreno-EGL: : EGL 1.4 QUALCOMM build: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030_msm8974_LA.BF.1.1.1_RB1__release_AU ()
    OpenGL ES Shader Compiler Version: E031.25.03.06
    Build Date: 05/17/15 Sun
    Local Branch: mybranch10089422
    Remote Branch: quic/LA.BF.1.1.1_rb1.22
    Local Patches: NONE
    Reconstruct Branch: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030 + 6151be1 + NOTHING
I/OpenGLRenderer: Initialized EGL, version 1.4
D/OpenGLRenderer: Enabling debug mode 0
I/Timeline: Timeline: Activity_idle id: android.os.BinderProxy@39c22f5f time:899263618
I/Timeline: Timeline: Activity_launch_request id:fr.univavignon.alize.AndroidALIZEDemo time:899265791
W/linker: libkaldi_jni.so: unused DT entry: type 0x6ffffef5 arg 0x189ee4
    libkaldi_jni.so: unused DT entry: type 0x6ffffffe arg 0x1be248
    libkaldi_jni.so: unused DT entry: type 0x6fffffff arg 0x3
I/Assets: Skipping asset model-android/ivector/online_cmvn.conf: checksums are equal
I/Assets: Skipping asset model-android/Gr.fst: checksums are equal
I/Assets: Skipping asset model-android/ivector/final.ie: checksums are equal
    Skipping asset model-android/ivector/final.dubm: checksums are equal
I/Assets: Skipping asset model-android/ivector/splice.conf: checksums are equal
    Skipping asset model-android/ivector/final.mat: checksums are equal
I/Assets: Skipping asset model-android/final.mdl: checksums are equal
    Skipping asset model-android/ivector/global_cmvn.stats: checksums are equal
I/Assets: Skipping asset model-android/HCLr.fst: checksums are equal
I/Assets: Skipping asset model-android/words.txt: checksums are equal
I/Assets: Skipping asset model-android/disambig_tid.int: checksums are equal
I/Assets: Skipping asset model-android/mfcc.conf: checksums are equal
I/Assets: Skipping asset model-android/word_boundary.int: checksums are equal
D/!!!!: /storage/emulated/0/Android/data/fr.univavignon.alize.AndroidALIZEDemo/files/sync
D/OpenGLRenderer: endAllStagingAnimators on 0xb49bc600 (RippleDrawable) with handle 0xaedde800
I/Timeline: Timeline: Activity_idle id: android.os.BinderProxy@14eee06b time:899265997
V/KaldiDemo: Computing derived variables for iVector extractor
V/KaldiDemo: Done.
V/KaldiDemo: Removed 1 orphan nodes.
    Removing 2 orphan components.
    Added 1 components, removed 2
V/KaldiDemo: Spent 0.263713 seconds in looped compilation.
I/System.out: FINAL RESULT IS {"result" : [ {"word": "one", "start" : 1.32, "end" : 1.68, "conf" : 1} ], "text" : "one" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "one", "start" : 3.03, "end" : 3.45, "conf" : 1}, {"word": "ten", "start" : 4.5, "end" : 4.77, "conf" : 0.597119} ], "text" : "one ten" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "sorry", "start" : 5.88, "end" : 6.21, "conf" : 1} ], "text" : "sorry" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "five", "start" : 7.95, "end" : 8.49, "conf" : 1} ], "text" : "five" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "seven", "start" : 10.92, "end" : 11.46, "conf" : 0.996388} ], "text" : "seven" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "eight", "start" : 12.87, "end" : 13.29, "conf" : 0.999997} ], "text" : "eight" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "nine", "start" : 15.45, "end" : 15.93, "conf" : 1} ], "text" : "nine" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "ten", "start" : 18.18, "end" : 18.63, "conf" : 1} ], "text" : "ten" }
I/System.out: FINAL RESULT IS {"result" : [ ], "text" : "" }
I/System.out: FINAL RESULT IS {"result" : [ {"word": "thirteen", "start" : 28.5, "end" : 29.19, "conf" : 1} ], "text" : "thirteen" }
A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 4190 (AudioRecorder T)
Process 3198 terminated.

chamecall commented 4 years ago

In addition: the demo app works great, but when I used the same code, with the same structure, in another project, I got the error above.

chamecall commented 4 years ago

Now I've got a new message:

...
I/System.out: FINAL RESULT IS {"result" : [ {"word": "and", "start" : 27.96, "end" : 28.11, "conf" : 0.999979},
    {"word": "gain", "start" : 28.11, "end" : 28.53, "conf" : 0.995035}
     ], "text" : "and gain" }
V/KaldiDemo: No node named 'ivector'in network.
A/libc: /usr/local/google/buildbot/src/android/ndk-release-r20/external/libcxx/../../external/libcxxabi/src/abort_message.cpp:73: abort_message: assertion "terminating with uncaught exception of type kaldi::KaldiFatalError: kaldi::KaldiFatalError" failed
    Fatal signal 6 (SIGABRT), code -6 in tid 11757 (AudioRecorder T)
Process 9634 terminated.

and this one:

V/KaldiDemo: Assertion failed: ((trans == kNoTrans && M.NumCols() == v.dim_ && M.NumRows() == dim_) || (trans == kTrans && M.NumRows() == v.dim_ && M.NumCols() == dim_))
A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 14208 (AudioRecorder T)

All I've done is increase the AudioRecord buffer size when creating it.

nshmyrev commented 4 years ago

I'm talking about RAM usage, not SDCARD usage. It simply goes out of memory in your case.

If you want speaker identification, ALIZE is a bad idea; it is better to use native Kaldi models and methods.

chamecall commented 4 years ago

I'm talking about RAM usage, not SDCARD usage. It simply goes out of memory in your case.

But how is is possible if it works fine with your demo having the same structure?

nshmyrev commented 4 years ago

ALIZE uses an enormous amount of memory. You can check https://stackoverflow.com/questions/34566061/androidstudio-show-usage-of-ram for details too.

chamecall commented 4 years ago

ALIZE uses an enormous amount of memory. You can check https://stackoverflow.com/questions/34566061/androidstudio-show-usage-of-ram for details too.

I just commented out the creation of the ALIZE instance, but I hit that error again. I just committed the current state of the project: here. Can you take a look at it?

nshmyrev commented 4 years ago

Sure, but show me the memory graph first.

chamecall commented 4 years ago

Sure, but show me the memory graph first.

In those 10-30 seconds, when it still works?

nshmyrev commented 4 years ago

Yes, you can show me adb shell dumpsys meminfo <appid> output for example before it crashes.

chamecall commented 4 years ago

Yes, you can show me adb shell dumpsys meminfo <appid> output for example before it crashes.

Applications Memory Usage (kB):
Uptime: 914613827 Realtime: 1639887479

** MEMINFO in pid 18526 [fr.univavignon.alize.AndroidALIZEDemo] **
                   Pss  Private  Private  Swapped     Heap     Heap     Heap
                 Total    Dirty    Clean    Dirty     Size    Alloc     Free
                ------   ------   ------   ------   ------   ------   ------
  Native Heap    57351    57340        0      844   102400    66626    35773
  Dalvik Heap     1715     1684        0     3976     9804     6041     3763
 Dalvik Other      256      256        0        0                           
        Stack      176      176        0        0                           
       Ashmem        6        0        0        0                           
      Gfx dev     1472     1472        0        0                           
    Other dev        5        0        4        0                           
     .so mmap     4646      428     3876     1244                           
    .apk mmap      237        0      132        0                           
    .ttf mmap       56        0       56        0                           
    .dex mmap     1284        0     1280        0                           
    .oat mmap      631        0      140        0                           
    .art mmap     1788      556      784        0                           
   Other mmap        4        4        0        0                           
   EGL mtrack    16672    16672        0        0                           
      Unknown      360      360        0       52                           
        TOTAL    86659    78948     6272     6116   112204    72667    39536

 Objects
               Views:       24         ViewRootImpl:        2
         AppContexts:        4           Activities:        2
              Assets:        4        AssetManagers:        4
       Local Binders:        8        Proxy Binders:       15
       Parcel memory:        3         Parcel count:       12
    Death Recipients:        0      OpenSSL Sockets:        0

 SQL
         MEMORY_USED:        0
  PAGECACHE_OVERFLOW:        0          MALLOC_SIZE:        0

Could it help you?
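
For tracking memory over the 10-30 seconds before a crash, the TOTAL row of a dump like the one above can be pulled out with a small script. This is only a sketch under stated assumptions: the helper names parse_meminfo_row and dump_meminfo are made up here, the column order is assumed to match the Android 5.1 layout shown above (other Android versions print different columns), and dump_meminfo needs adb on the PATH with a device attached.

```python
import re
import subprocess

# Column names follow the Android 5.1 dump quoted in this thread.
# Assumption: other Android versions may print different columns.
COLUMNS = ("pss_total", "private_dirty", "private_clean",
           "swapped_dirty", "heap_size", "heap_alloc", "heap_free")


def parse_meminfo_row(dump, label="TOTAL"):
    """Return the numeric columns (in kB) of the first row starting with `label`."""
    for line in dump.splitlines():
        stripped = line.strip()
        if stripped.startswith(label):
            numbers = [int(n) for n in re.findall(r"\d+", stripped[len(label):])]
            # Rows such as "Stack" have fewer columns; zip stops at the shorter side.
            return dict(zip(COLUMNS, numbers))
    raise ValueError("no %r row found" % label)


def dump_meminfo(package):
    """Run `adb shell dumpsys meminfo <package>` and return its text output."""
    result = subprocess.run(
        ["adb", "shell", "dumpsys", "meminfo", package],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Calling parse_meminfo_row(dump_meminfo("fr.univavignon.alize.AndroidALIZEDemo"))["pss_total"] once a second would show whether total Pss climbs before the crash, which is what an out-of-memory condition would look like.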

chamecall commented 4 years ago

Yes, you can show me adb shell dumpsys meminfo <appid> output for example before it crashes.

Also, here is the memory allocation shown in Android Studio just before the crash (screenshot: Selection_001).

chamecall commented 4 years ago

In addition, here is the logcat output, also from before the crash:

01-14 17:42:51.580 6144-6965/? A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 6965 (KaldiThread)
01-14 17:42:51.575 9361-9361/? W/debuggerd_real: type=1400 audit(0.0:56535): avc: denied { ptrace } for scontext=u:r:init:s0 tcontext=u:r:untrusted_app:s0 tclass=process op_res=0 ppid=9258 pcomm="debuggerd" tgid=9258 tgcomm="debuggerd"
01-14 17:42:51.675 9361-9361/? W/debuggerd_real: type=1400 audit(0.0:56536): avc: denied { sigstop } for scontext=u:r:init:s0 tcontext=u:r:untrusted_app:s0 tclass=process op_res=0 ppid=9258 pcomm="debuggerd" tgid=9258 tgcomm="debuggerd"
01-14 17:42:51.688 9361-9361/? D/clmlib: Got activities:0x0000000E
01-14 17:42:51.689 9361-9361/? I/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-14 17:42:51.689 9361-9361/? I/DEBUG: UUID: 6ad24a81-058b-4469-a1d3-1daaf04804d8
01-14 17:42:51.689 9361-9361/? I/DEBUG: Build fingerprint: 'Sony/D5503/D5503:5.1.1/14.6.A.1.236/2031203603:user/release-keys'
01-14 17:42:51.689 9361-9361/? I/DEBUG: Revision: '0'
01-14 17:42:51.690 9361-9361/? I/DEBUG: ABI: 'arm'
01-14 17:42:51.690 9361-9361/? I/DEBUG: pid: 6144, tid: 6965, name: KaldiThread  >>> fr.univavignon.alize.AndroidALIZEDemo <<<
01-14 17:42:51.690 9361-9361/? I/DEBUG: signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
01-14 17:42:51.730 9361-9361/? I/DEBUG:     r0 00000000  r1 00001b35  r2 00000006  r3 00000000
01-14 17:42:51.730 9361-9361/? I/DEBUG:     r4 a0ef0db8  r5 00000006  r6 0000000b  r7 0000010c
01-14 17:42:51.730 9361-9361/? I/DEBUG:     r8 00000001  r9 a2b08adb  sl a2b08d30  fp a0ef04f0
01-14 17:42:51.730 9361-9361/? I/DEBUG:     ip 00001b35  sp a0ef03d0  lr b6db48a9  pc b6dd8608  cpsr 600f0010
01-14 17:42:51.731 9361-9361/? I/DEBUG: backtrace:
01-14 17:42:51.731 9361-9361/? I/DEBUG:     #00 pc 00038608  /system/lib/libc.so (tgkill+12)
01-14 17:42:51.731 9361-9361/? I/DEBUG:     #01 pc 000148a5  /system/lib/libc.so (pthread_kill+52)
01-14 17:42:51.731 9361-9361/? I/DEBUG:     #02 pc 000154c3  /system/lib/libc.so (raise+10)
01-14 17:42:51.732 9361-9361/? I/DEBUG:     #03 pc 00011c09  /system/lib/libc.so (__libc_android_abort+36)
01-14 17:42:51.732 9361-9361/? I/DEBUG:     #04 pc 0001007c  /system/lib/libc.so (abort+4)
01-14 17:42:51.732 9361-9361/? I/DEBUG:     #05 pc 005b0788  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)+380)
01-14 17:42:51.733 9361-9361/? I/DEBUG:     #06 pc 003577f0  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::nnet3::TdnnComponent::Propagate(kaldi::nnet3::ComponentPrecomputedIndexes const*, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float>*) const+984)
01-14 17:42:51.733 9361-9361/? I/DEBUG:     #07 pc 00339f40  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::nnet3::NnetComputer::ExecuteCommand()+776)
01-14 17:42:51.733 9361-9361/? I/DEBUG:     #08 pc 0033b330  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::nnet3::NnetComputer::Run()+224)
01-14 17:42:51.734 9361-9361/? I/DEBUG:     #09 pc 0034e2b4  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::nnet3::DecodableNnetLoopedOnlineBase::AdvanceChunk()+808)
01-14 17:42:51.734 9361-9361/? I/DEBUG:     #10 pc 0034e750  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::nnet3::DecodableAmNnetLoopedOnline::LogLikelihood(int, int)+56)
01-14 17:42:51.734 9361-9361/? I/DEBUG:     #11 pc 002661a8  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::LatticeFasterDecoderTpl<fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::decoder::BackpointerToken>::ProcessEmitting(kaldi::DecodableInterface*)+796)
01-14 17:42:51.735 9361-9361/? I/DEBUG:     #12 pc 00263314  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (kaldi::LatticeFasterDecoderTpl<fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::decoder::BackpointerToken>::AdvanceDecoding(kaldi::DecodableInterface*, int)+340)
01-14 17:42:51.735 9361-9361/? I/DEBUG:     #13 pc 001fd273  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (KaldiRecognizer::AcceptWaveform(char const*, int)+162)
01-14 17:42:51.736 9361-9361/? I/DEBUG:     #14 pc 001fca57  /data/app/fr.univavignon.alize.AndroidALIZEDemo-2/lib/arm/libkaldi_jni.so (Java_org_kaldi_voskJNI_KaldiRecognizer_1AcceptWaveform+38)
01-14 17:42:51.736 9361-9361/? I/DEBUG:     #15 pc 002c81ef  /data/dalvik-cache/arm/data@app@fr.univavignon.alize.AndroidALIZEDemo-2@base.apk@classes.dex
01-14 17:42:51.988 374-6178/? W/AudioFlinger: RecordThread: buffer overflow
01-14 17:42:52.250 9361-9361/? I/DEBUG: Tombstone written to: /data/tombstones/tombstone_00
01-14 17:42:52.245 9361-9361/? W/debuggerd_real: type=1400 audit(0.0:56537): avc: denied { signal } for scontext=u:r:init:s0 tcontext=u:r:untrusted_app:s0 tclass=process op_res=0 ppid=9258 pcomm="debuggerd" tgid=9258 tgcomm="debuggerd"
01-14 17:42:52.258 1056-1117/? I/BootReceiver: Copying /data/tombstones/tombstone_00 to DropBox (SYSTEM_TOMBSTONE)
01-14 17:42:52.276 1056-8049/? D/ActivityManager: New dropbox entry: fr.univavignon.alize.AndroidALIZEDemo, data_app_native_crash, 65bd8620-c326-4b8d-9ad8-3cd74c03af10
01-14 17:42:52.276 1056-8049/? W/ActivityManager:   Force finishing activity 1 fr.univavignon.alize.AndroidALIZEDemo/.SpeechScanningActivity
01-14 17:42:52.296 1056-2771/? I/WindowState: WIN DEATH: Window{3d451d2b u0 fr.univavignon.alize.AndroidALIZEDemo/fr.univavignon.alize.AndroidALIZEDemo.SpeechScanningActivity}
01-14 17:42:52.313 374-6178/? D/audio_hw_primary: in_standby: enter: stream (0xa9521b00) usecase(7: audio-record)
01-14 17:42:52.318 396-396/? I/Zygote: Process 6144 exited due to signal (6)
01-14 17:42:52.357 1056-2512/? I/ActivityManager: Process fr.univavignon.alize.AndroidALIZEDemo (pid 6144) has died
01-14 17:42:52.360 374-6178/? D/hardware_info: hw_info_append_hw_type : device_name = handset-mic-asr
01-14 17:42:52.363 374-2398/? D/audio_hw_primary: adev_close_input_stream: enter:stream_handle(0xa9521b00)
01-14 17:42:52.363 374-2398/? D/audio_hw_primary: in_standby: enter: stream (0xa9521b00) usecase(7: audio-record)

nshmyrev commented 4 years ago

OK, I figured it out. The model variable shouldn't be local to the setup task, because it holds the model data. If you make it local, it is soon garbage collected and everything crashes.

Make the model a field of the activity so that it stays alive as long as the activity does; then everything will be fine.

I will create a separate issue about it.
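
The lifetime problem described above can be shown in miniature. The sketch below is not Vosk code: FakeNativeModel, FakeRecognizer, and Activity are hypothetical stand-ins, and it assumes (the thread does not spell this out) that the recognizer keeps only a raw native handle to the model rather than a reference to the wrapper object. It is written in Python, where CPython frees an object as soon as its last reference disappears; on Android the garbage collector runs some time later, which would fit the crash showing up only after several seconds.

```python
import itertools


class FakeNativeModel:
    """Hypothetical wrapper that owns native (C++) model memory."""
    live_handles = set()            # handles whose native memory is still valid
    _handles = itertools.count(1)   # source of fresh handle numbers

    def __init__(self):
        self.handle = next(type(self)._handles)
        type(self).live_handles.add(self.handle)

    def __del__(self):
        # Mimics a finalizer that frees the native memory when the wrapper dies.
        type(self).live_handles.discard(self.handle)


class FakeRecognizer:
    """Hypothetical recognizer that stores only the raw handle (an assumption)."""

    def __init__(self, model):
        self.model_handle = model.handle   # no reference to `model` is kept

    def model_is_valid(self):
        return self.model_handle in FakeNativeModel.live_handles


def setup_with_local():
    model = FakeNativeModel()       # local: the last reference dies on return
    return FakeRecognizer(model)


class Activity:
    def setup_with_field(self):
        self.model = FakeNativeModel()   # field: lives as long as the Activity
        return FakeRecognizer(self.model)
```

With the local variable, the recognizer returned by setup_with_local() already points at freed model data. With the field, the handle stays valid for as long as the Activity object itself is referenced.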

chamecall commented 4 years ago

OK, I figured it out. The model variable shouldn't be local to the setup task, because it holds the model data. If you make it local, it is soon garbage collected and everything crashes.

Make the model a field of the activity so that it stays alive as long as the activity does; then everything will be fine.

I will create a separate issue about it.

You saved my day! Actually, two days :) Can you tell me how exactly you found that out?

chamecall commented 4 years ago

I'm talking about RAM usage, not SDCARD usage. It simply goes out of memory in your case.

If you want speaker identification, ALIZE is a bad idea; it is better to use native Kaldi models and methods.

I'd really appreciate it if you could point me to some examples of how to implement this (Kaldi speaker ID) on Android or something similar.

chamecall commented 4 years ago

I'm talking about RAM usage, not SDCARD usage. It simply goes out of memory in your case. If you want speaker identification, ALIZE is a bad idea; it is better to use native Kaldi models and methods.

I'd really appreciate it if you could point me to some examples of how to implement this (Kaldi speaker ID) on Android or something similar.

I'm sorry to ask again, but have you read this?

nshmyrev commented 4 years ago

I have plans to implement it, but no strict deadline yet.

Overall, you just need to take the Kaldi C++ API and wrap it the same way as the ASR API; then you have to put the models on the phone together with the ASR models and run them.

chamecall commented 4 years ago

I have plans to implement it, but no strict deadline yet.

Overall, you just need to take the Kaldi C++ API and wrap it the same way as the ASR API; then you have to put the models on the phone together with the ASR models and run them.

Could you point out where I should start working on it?

nshmyrev commented 4 years ago

Well, you can start by training VoxCeleb models that run reasonably fast with the Kaldi scripts on the desktop. That would be the first step. You need to play with the number of parameters in the x-vector extractor.

chamecall commented 4 years ago

Well, you can start by training VoxCeleb models that run reasonably fast with the Kaldi scripts on the desktop. That would be the first step. You need to play with the number of parameters in the x-vector extractor.

First of all, do I have to run the first recipe (v1) or the second one (v2) of this recipe?

And what do you mean by parameters of the x-vector extractor? Something like these?

nj=40
cmd="run.pl"
stage=0
norm_vars=false
center=true
compress=true
cmn_window=300

nshmyrev commented 4 years ago

You need v2. Network parameters are configured in this script https://github.com/kaldi-asr/kaldi/blob/2b3e7641e7f3cb5298dff1133517a8b1dd3c1ac3/egs/voxceleb/v1/local/nnet3/xvector/tuning/run_xvector_1a.sh#L97

chamecall commented 4 years ago

You need v2. Network parameters are configured in this script https://github.com/kaldi-asr/kaldi/blob/2b3e7641e7f3cb5298dff1133517a8b1dd3c1ac3/egs/voxceleb/v1/local/nnet3/xvector/tuning/run_xvector_1a.sh#L97

I hope we're still talking about deploying the model on Android in the end. So, among the parameters you pointed to, could I change the 'dim' parameter of any of the tdnn[1..7] layers?

chamecall commented 4 years ago

I found a pre-trained model here. As a test, I'd like to port it to Android. What are my next steps?

chamecall commented 4 years ago

Also, to train a model: do I have to compile Kaldi for Android first, as described here, or can I train with the desktop version of Kaldi and then use the trained model with the Android build?

nshmyrev commented 4 years ago

@chamecall speaker ID has landed in vosk-api:

https://github.com/alphacep/vosk-api/blob/master/python/example/test_local_speaker.py
https://github.com/alphacep/vosk-api/blob/master/java/test/DecoderTest.java
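
The linked examples are not reproduced here. As a rough sketch of what is commonly done once speaker vectors are available: assuming the recognizer yields one fixed-length speaker vector (x-vector) per utterance, which this thread does not show, two utterances can be compared by cosine distance. The function name cosine_distance is chosen for this illustration, and no decision threshold is suggested because a sensible value depends on the model.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors.

    0 means the vectors point the same way, 1 means they are orthogonal,
    and 2 means they point in opposite directions.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a zero vector has no direction")
    return 1.0 - dot / (norm_a * norm_b)
```

A smaller distance between an enrolled speaker's vector and a new utterance's vector suggests the same speaker. Kaldi's own recipes usually score x-vectors with PLDA instead, so treat cosine distance as a simple baseline.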