tryolabs / TLSphinx

Swift wrapper around Pocketsphinx
MIT License

Device does not support required sample rate recording #24

Open ghost opened 8 years ago

ghost commented 8 years ago

2016-08-14 20:49:11.603 ACRCloudDemo_Swift[2332:76253] HER INFO: cmd_ln.c(697): Parsing command line: \
-hmm /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us \
-lm /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us.lm.dmp \
-dict /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/cmudict-en-us.dict

Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/cmudict-en-us.dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 0
-lm /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us.lm.dmp
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02

INFO: cmd_ln.c(697): Parsing command line: \
-lowerf 130 \
-upperf 6800 \
-nfilt 25 \
-transform dct \
-lifter 22 \
-feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-agc none \
-cmn current \
-varnorm no \
-model ptm \
-cmninit 40,3,-1

Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 40,3,-1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 22
-logspec no no
-lowerf 133.33334 1.300000e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-vad_postspeech 50 50
-vad_prespeech 10 10
-vad_threshold 2.0 2.000000e+00
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.562500e-02

INFO: acmod.c(252): Parsed model-specific feature parameters from /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/feat.params
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/transition_matrices
INFO: acmod.c(124): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/means
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 128x13
INFO: ms_gauden.c(294): 128x13
INFO: ms_gauden.c(294): 128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/variances
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 128x13
INFO: ms_gauden.c(294): 128x13
INFO: ms_gauden.c(294): 128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(835): Maximum top-N: 4
INFO: phone_loop_search.c(115): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 137526 * 32 bytes (4297 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/cmudict-en-us.dict
INFO: dict.c(213): Allocated 1007 KiB for strings, 1662 KiB for phones
INFO: dict.c(336): 133425 words read
INFO: dict.c(342): Reading filler dictionary: /Users/Administrator/Library/Developer/CoreSimulator/Devices/8B673B41-2CE3-4E43-B848-3651BD36A0F9/data/Containers/Bundle/Application/CB5DFC36-5AA6-4C79-B1B7-90734EC00C58/ACRCloudDemo_Swift.app/en-us/en-us/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(345): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=19794, 2=1377200, 3=3178194
INFO: ngram_model_dmp.c(242): 19794 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(288): 1377200 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(314): 3178194 = LM.trigrams read
INFO: ngram_model_dmp.c(339): 57155 = LM.prob2 entries read
INFO: ngram_model_dmp.c(359): 10935 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(379): 34843 = LM.prob3 entries read
INFO: ngram_model_dmp.c(407): 2690 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(463): 19794 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 788 unique initial diphones
INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 56 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 56 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 44782
INFO: ngram_search_fwdtree.c(339): after: 573 root, 44654 non-root channels, 47 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
2016-08-14 20:49:12.787 ACRCloudDemo_Swift[2332:76253] 20:49:12.786 ERROR: AVAudioIONodeImpl.mm:784: SetOutputFormat: required condition is false: format.sampleRate == hwFormat.sampleRate
2016-08-14 20:49:12.802 ACRCloudDemo_Swift[2332:76253] * Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: format.sampleRate == hwFormat.sampleRate'
* First throw call stack:
(
0 CoreFoundation 0x000000010d484d85 exceptionPreprocess + 165
1 libobjc.A.dylib 0x000000010f260deb objc_exception_throw + 48
2 CoreFoundation 0x000000010d484bea +[NSException raise:format:arguments:] + 106
3 libAVFAudio.dylib 0x00000001100bfff3 _Z19AVAE_RaiseExceptionP8NSStringz + 176
4 libAVFAudio.dylib 0x0000000110101aef _ZN17AVAudioIONodeImpl15SetOutputFormatEmP13AVAudioFormat + 533
5 libAVFAudio.dylib 0x00000001100d2ead _ZN18AVAudioEngineGraph8_ConnectEP19AVAudioNodeImplBaseS1_jjP13AVAudioFormat + 2027
6 libAVFAudio.dylib 0x00000001100d5df0 _ZN18AVAudioEngineGraph7ConnectEP11AVAudioNodeS1_mmP13AVAudioFormat + 322
7 libAVFAudio.dylib 0x0000000110108a71 _ZN17AVAudioEngineImpl7ConnectEP11AVAudioNodeS1_mmP13AVAudioFormat + 301
8 libAVFAudio.dylib 0x0000000110108ad8 -[AVAudioEngine connect:to:format:] + 83
9 ACRCloudDemo_Swift 0x000000010c4b7737 _TFC18ACRCloudDemo_Swift7Decoder19startDecodingSpeechfFGSqVS_10Hypothesis_TT + 1127
10 ACRCloudDemo_Swift 0x000000010c4aca45 _TFC18ACRCloudDemo_Swift14ViewController11viewDidLoadfTT + 2149
11 ACRCloudDemo_Swift 0x000000010c4ad892 _TToFC18ACRCloudDemo_Swift14ViewController11viewDidLoadfTT + 34
12 UIKit 0x000000010de47984 -[UIViewController loadViewIfRequired] + 1198
13 UIKit 0x000000010de47cd3 -[UIViewController view] + 27
14 UIKit 0x000000010dd1dfb4 -[UIWindow addRootViewControllerViewIfPossible] + 61
15 UIKit 0x000000010dd1e69d -[UIWindow _setHidden:forced:] + 282
16 UIKit 0x000000010dd30180 -[UIWindow makeKeyAndVisible] + 42
17 UIKit 0x000000010dca4ed9 -[UIApplication _callInitializationDelegatesForMainScene:transitionContext:] + 4131
18 UIKit 0x000000010dcab568 -[UIApplication _runWithMainScene:transitionContext:completion:] + 1769
19 UIKit 0x000000010dca8714 -[UIApplication workspaceDidEndTransaction:] + 188
20 FrontBoardServices 0x000000011216e8c8 __FBSSERIALQUEUE_IS_CALLING_OUT_TO_A_BLOCK + 24
21 FrontBoardServices 0x000000011216e741 -[FBSSerialQueue _performNext] + 178
22 FrontBoardServices 0x000000011216eaca -[FBSSerialQueue _performNextFromRunLoopSource] + 45
23 CoreFoundation 0x000000010d3aa301 CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION + 17
24 CoreFoundation 0x000000010d3a022c CFRunLoopDoSources0 + 556
25 CoreFoundation 0x000000010d39f6e3 CFRunLoopRun + 867
26 CoreFoundation 0x000000010d39f0f8 CFRunLoopRunSpecific + 488
27 UIKit 0x000000010dca7f21 -[UIApplication _run] + 402
28 UIKit 0x000000010dcacf09 UIApplicationMain + 171
29 ACRCloudDemo_Swift 0x000000010c4ba742 main + 114
30 libdyld.dylib 0x000000010fd5492d start + 1
)
libc++abi.dylib: terminating with uncaught exception of type NSException
(lldb)

So this is my output, followed by a crash. When my app starts, I call startDecodingSpeech. How can I get specific outputs in an organized way? Help.

ghost commented 8 years ago

bump

nshmyrev commented 8 years ago

Also asked at

http://stackoverflow.com/questions/39088320/tlsphinx-audio-recognition-issues-in-swift

ghost commented 8 years ago

Resolved. I had to change the CHANNEL to 2 instead of 1.

nshmyrev commented 8 years ago

I had another similar report recently. It crashes with:

* Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: IsFormatSampleRateAndChannelCountValid(outputHWFormat)'

The offending line is the one linked here; once it is removed, it works better: https://github.com/tryolabs/TLSphinx/blob/master/TLSphinx/Decoder.swift#L203

cgamache commented 7 years ago

@qwill: I tried 2 for channels and got the same error on my iPhone 6s. @nshmyrev: I removed the referenced line, but no good there either.

I did some digging, and although I couldn't find a true smoking gun, some Apple docs indicate in a footnote that the iPhone 6s does not support a sample rate of 44100. It does support 48000. But even with that change, I still get errors:

ERROR: [0x1b20cac40] >avae> AVAudioIONodeImpl.mm:884: SetOutputFormat: required condition is false: format.sampleRate == hwFormat.sampleRate

I've been digging around trying to find a way to get those two sample rates to match up. Any clues or assistance would be very welcome!

tar500 commented 7 years ago

@cgamache I see the same issue in another project using AVAudioEngine: required condition is false: format.sampleRate == hwFormat.sampleRate. It works all the time, but starts to fail when one particular song is added to an unrelated AVAudioMix and played back via AVPlayer. It is blowing my mind; it seems like pure magic. The Apple developer forums have a related thread in which Apple staff confirm the existence of similar bugs: https://forums.developer.apple.com/message/36184#36184

I'm really close to giving up and getting rid of AVAudioEngine in my project :(

cgamache commented 7 years ago

A quick update. My latest experiments attempting to fix this issue revolve around injecting a mixer node between the input and output nodes to alleviate the mismatches. No luck so far, but the AVAudioEngine documentation is less-than-exhaustive and there are plenty of permutations to try.

nshmyrev commented 7 years ago

Well, pocketsphinx can process stereo 48k, no issues. You need to add -nfft 2048 -samprate 48000. We can also introduce a resampler.
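
The -nfft value suggested above can be checked with a little arithmetic. This is a minimal sketch, assuming the front end needs an FFT size that is a power of two and at least as large as one analysis window in samples (sample rate times -wlen, which the configuration dump earlier in this thread shows as 0.025625 s). The function name minimumFFTSize is hypothetical and is not part of TLSphinx or pocketsphinx.

```swift
import Foundation

/// Smallest power of two that can hold one analysis window of
/// `windowLength` seconds sampled at `sampleRate` Hz.
/// Assumption: the FFT size must cover the whole window.
func minimumFFTSize(sampleRate: Double, windowLength: Double = 0.025625) -> Int {
    // Samples per analysis window, rounded to the nearest integer.
    let frameSize = Int((sampleRate * windowLength).rounded())
    var fftSize = 1
    while fftSize < frameSize {
        fftSize *= 2
    }
    return fftSize
}

// 16000 Hz -> 410 samples per window  -> 512 (the default shown in the log)
// 44100 Hz -> 1130 samples per window -> 2048
// 48000 Hz -> 1230 samples per window -> 2048
for rate in [16000.0, 44100.0, 48000.0] {
    print("-samprate \(Int(rate)) -nfft \(minimumFFTSize(sampleRate: rate))")
}
```

Under that assumption, 48000 Hz needs 1230 samples per window, which no longer fits in the default 512-point FFT, so 2048 is the next size that works.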

revolter commented 7 years ago

I am having the same crash, though I don't use this library. Over the past few days I got 8 crashes on iPhone 6s Plus and one on iPhone 6s, but when testing locally on an iPhone 6s I get no crash. This is the audio format:

<AVAudioFormat 0x170480000:  1 ch,  44100 Hz, Float32>

randydalrymple commented 5 years ago

Comment moved from Issue #54:

Note: I edited my app name to "XXXXXXX" for posting.

When running on my device (iPhone 6s/iOS 12.2), TLSphinx crashes with the following error:

2019-04-16 21:20:22.999246-0400 XXXXXXX[8288:5047903] [aurioc] 1029: failed: -10851 (enable 1, outf< 2 ch, 0 Hz, Float32, non-inter> inf< 2 ch, 0 Hz, Float32, non-inter>)
2019-04-16 21:20:23.023423-0400 XXXXXXX[8288:5047903] [avae] AVAEInternal.h:70:_AVAE_Check: required condition is false: [AVAudioEngineGraph.mm:2037:_Connect: (IsFormatSampleRateAndChannelCountValid(format))]
2019-04-16 21:20:23.023753-0400 XXXXXXX[8288:5047903] Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: IsFormatSampleRateAndChannelCountValid(format)'
First throw call stack:
(0x1bc0d8518 0x1bb2b39f8 0x1bbff2148 0x1c1e99438 0x1c1e98a70 0x1c1ec8de4 0x1c1f40e9c 0x1c1f40f24 0x1049ea91c 0x10239f2ec 0x1023af7b4 0x1023a8e64 0x1023a8ec8 0x1e84d9230 0x1e7f82af8 0x1e7f82e18 0x1e7f81e84 0x1e851029c 0x1e85114c4 0x1e84f1534 0x1e85b77c0 0x1e85b9eec 0x1e85b311c 0x1bc06a2bc 0x1bc06a23c 0x1bc069b24 0x1bc064a60 0x1bc064354 0x1be26479c 0x1e84d7b68 0x1023f7c30 0x1bbb2a8e0)
libc++abi.dylib: terminating with uncaught exception of type NSException
(lldb)

However, it runs as expected on the simulator (same hardware/iOS). I single-stepped and found that the error is thrown at line 189 of Decoder.swift:

"engine.connect(input, to: mixer, format: input.outputFormat(forBus: 0))"

Nothing in the code looks amiss.

Is there a workaround available?

randydalrymple commented 5 years ago

After copying the code in Decoder.swift to my own project and debugging from there, I found that the crash at line 189 of Decoder.swift can be avoided by changing the AVAudioSession category from .playback to .playAndRecord at line 173. However, I have no idea how to rebuild the TLSphinx module and incorporate it into my project (see Issue #57). Is there any procedure to do this?
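
For readers hitting the same crash, the category change described above might look roughly like the sketch below. It is not the TLSphinx source: prepareRecordingSession and isConnectableFormat are hypothetical names, and the check is only a guess at what the failing IsFormatSampleRateAndChannelCountValid assertion tests, based on the "0 Hz" input format reported in the crash log. The AVFoundation part is wrapped in #if os(iOS) because AVAudioSession categories are only configurable there.

```swift
import Foundation
#if os(iOS)
import AVFoundation
#endif

/// Hypothetical helper: a format reporting 0 Hz or 0 channels (as in the
/// crash log above) is assumed to be unusable for engine.connect.
func isConnectableFormat(sampleRate: Double, channelCount: Int) -> Bool {
    return sampleRate > 0 && channelCount > 0
}

#if os(iOS)
/// Hypothetical helper: configure the shared session for recording before
/// touching the engine's input node, then report whether the input format
/// looks usable.
func prepareRecordingSession(engine: AVAudioEngine) throws -> Bool {
    let session = AVAudioSession.sharedInstance()
    // .playAndRecord enables the microphone route; .playback does not.
    try session.setCategory(.playAndRecord, mode: .default, options: [])
    try session.setActive(true)

    // Read the hardware format only after the session is active.
    let format = engine.inputNode.outputFormat(forBus: 0)
    return isConnectableFormat(sampleRate: format.sampleRate,
                               channelCount: Int(format.channelCount))
}
#endif
```

A caller could bail out with a readable error when the helper returns false instead of letting engine.connect raise an uncaught NSException.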

BrunoBerisso commented 5 years ago

Oh! That's a bug. The category should never be .playback. It would be great if you could submit a pull request for that change 😄

randydalrymple commented 5 years ago

I created a pull request, but I'm not sure I did it correctly (this is my first experience with the process).

randydalrymple commented 5 years ago

TLSphinx.Decoder now works successfully in my project. Very responsive and accurate with a small word dictionary. Thank you.

I still need to make changes to Decoder to set the AVAudioSession.Category outside of Decoder (see Issue #57).