unispeech / unimrcp

Open source cross-platform implementation of MRCP protocol
http://www.unimrcp.org
Apache License 2.0
379 stars 167 forks source link

Getting No NLSML data available on speech recognition input validation [AzureSR] #325

Open ackris opened 8 months ago

ackris commented 8 months ago

Hi Team,

We are getting no nlsml data available error when we had executed run bsr1.

2024-03-06 06:52:18:992359 [INFO]   Raise App MRCP Response umc-1 <9567c86c26324318>
2024-03-06 06:52:19:238253 [INFO]   Receive MRCPv2 Data 10.0.0.4:47574 <-> 10.0.0.4:1544 [154 bytes]
MRCP/2.0 154 RECOGNITION-COMPLETE 1 COMPLETE
Channel-Identifier: 9567c86c26324318@speechrecog
Completion-Cause: 006 recognizer-error
Waveform-Uri: 

2024-03-06 06:52:19:238341 [INFO]   Raise App MRCP Event umc-1 <9567c86c26324318>
2024-03-06 06:52:19:238404 [WARN]   No NLSML data available
2024-03-06 06:52:19:238448 [INFO]   Receive App Request umc-1 <9567c86c26324318> [1]
2024-03-06 06:52:19:238463 [INFO]   Terminate Session umc-1 <9567c86c26324318>
2024-03-06 06:52:19:238576 [INFO]   Remove Control Channel <9567c86c26324318@speechrecog> [0]
2024-03-06 06:52:19:238616 [INFO]   Close TCP/MRCPv2 Connection 10.0.0.4:47574 <-> 10.0.0.4:1544
2024-03-06 06:52:19:239188 [INFO]   Receive SIP Event [nua_r_bye] Status 200 OK [SIP-Agent-1]
2024-03-06 06:52:19:239386 [INFO]   Receive SIP Event [nua_i_state] Status 200 to BYE [SIP-Agent-1]
2024-03-06 06:52:19:239423 [NOTICE] SIP Call State umc-1 [terminated]
2024-03-06 06:52:19:239502 [INFO]   Session Terminated umc-1 <9567c86c26324318>
2024-03-06 06:52:19:241109 [INFO]   Close RTP Transmitter 10.0.0.4:4000 -> 10.0.0.4:5020 [s:12 o:3840]
2024-03-06 06:52:19:241152 [INFO]   Remove RTP Session 10.0.0.4:4000
2024-03-06 06:52:19:241220 [NOTICE] Destroy TCP/MRCPv2 Connection 10.0.0.4:47574 <-> 10.0.0.4:1544
2024-03-06 06:52:19:241248 [INFO]   Remove MRCP Handle umc-1 <9567c86c26324318>
2024-03-06 06:52:19:241257 [INFO]   Raise App Response umc-1 <9567c86c26324318> [1] SUCCESS [0]
2024-03-06 06:52:19:241329 [NOTICE] Destroy MRCP Handle umc-1

However, we are able to 200 Ok for DTMF on executing run bsr2.


2024-03-06 06:53:03:881693 [INFO]   Raise App MRCP Response umc-2 <846f2c75a1f34801>
2024-03-06 06:53:03:942430 [INFO]   Receive MRCPv2 Data 10.0.0.4:40564 <-> 10.0.0.4:1544 [113 bytes]
MRCP/2.0 113 START-OF-INPUT 1 IN-PROGRESS
Channel-Identifier: 846f2c75a1f34801@speechrecog
Input-Type: dtmf

2024-03-06 06:53:03:942519 [INFO]   Raise App MRCP Event umc-2 <846f2c75a1f34801>
2024-03-06 06:53:04:302730 [INFO]   Receive MRCPv2 Data 10.0.0.4:40564 <-> 10.0.0.4:1544 [378 bytes]
MRCP/2.0 378 RECOGNITION-COMPLETE 1 COMPLETE
Channel-Identifier: 846f2c75a1f34801@speechrecog
Completion-Cause: 000 success
Content-Type: application/x-nlsml
Content-Length: 193

<?xml version="1.0"?>
<result>
  <interpretation grammar="builtin:dtmf/digits" confidence="1">
    <instance>1234</instance>
    <input mode="dtmf">1 2 3 4</input>
  </interpretation>
</result>
2024-03-06 06:53:04:302865 [INFO]   Raise App MRCP Event umc-2 <846f2c75a1f34801>
2024-03-06 06:53:04:303056 [INFO]   Interpretation[0].confidence: 1.00
2024-03-06 06:53:04:303082 [INFO]   Interpretation[0].grammar: builtin:dtmf/digits
2024-03-06 06:53:04:303098 [INFO]   Interpretation[0].instance[0]: 1234
2024-03-06 06:53:04:303112 [INFO]   Interpretation[0].input: 1 2 3 4
2024-03-06 06:53:04:303119 [INFO]   Interpretation[0].input.mode: dtmf
2024-03-06 06:53:04:303126 [INFO]   Interpretation[0].input.confidence: 1.00
2024-03-06 06:53:04:303183 [INFO]   Receive App Request umc-2 <846f2c75a1f34801> [1]
2024-03-06 06:53:04:303203 [INFO]   Terminate Session umc-2 <846f2c75a1f34801>
2024-03-06 06:53:04:303301 [INFO]   Remove Control Channel <846f2c75a1f34801@speechrecog> [0]
2024-03-06 06:53:04:303335 [INFO]   Close TCP/MRCPv2 Connection 10.0.0.4:40564 <-> 10.0.0.4:1544
2024-03-06 06:53:04:303853 [INFO]   Receive SIP Event [nua_r_bye] Status 200 OK [SIP-Agent-1]
2024-03-06 06:53:04:303874 [INFO]   Receive SIP Event [nua_i_state] Status 200 to BYE [SIP-Agent-1]
2024-03-06 06:53:04:303883 [NOTICE] SIP Call State umc-2 [terminated]
2024-03-06 06:53:04:303977 [INFO]   Session Terminated umc-2 <846f2c75a1f34801>
2024-03-06 06:53:04:311096 [INFO]   Close RTP Transmitter 10.0.0.4:4002 -> 10.0.0.4:5022 [s:33 o:132]
2024-03-06 06:53:04:311139 [INFO]   Remove RTP Session 10.0.0.4:4002
2024-03-06 06:53:04:311269 [NOTICE] Destroy TCP/MRCPv2 Connection 10.0.0.4:40564 <-> 10.0.0.4:1544
2024-03-06 06:53:04:311367 [INFO]   Remove MRCP Handle umc-2 <846f2c75a1f34801>
2024-03-06 06:53:04:311394 [INFO]   Raise App Response umc-2 <846f2c75a1f34801> [1] SUCCESS [0]
2024-03-06 06:53:04:311436 [NOTICE] Destroy MRCP Handle umc-2

The below are the logs.

2024-03-06 06:53:44:772364 [NOTICE] Add Session <a43daf0874ea42f6>
2024-03-06 06:53:44:772377 [INFO]   Receive Offer 0x7f99a8004838 <a43daf0874ea42f6> [c:1 a:1 v:0]
2024-03-06 06:53:44:772386 [INFO]   Found MRCP Engine [Azure-SR-1] for Resource [speechrecog] 0x7f99a8004838 <a43daf0874ea42f6>
2024-03-06 06:53:44:772436 [INFO]   Add Pending Control Channel <a43daf0874ea42f6@speechrecog> [1]
2024-03-06 06:53:44:782085 [INFO]   Enable RTP Session 10.0.0.4:5024
2024-03-06 06:53:44:782111 [INFO]   Open RTP Receiver 10.0.0.4:5024 <- 10.0.0.4:4004 playout [50 ms] bounds [0 - 600 ms] adaptive [1] skew detection [1]
2024-03-06 06:53:44:782117 [INFO]   Media Path 0x7f99a8004838 Source->[PCMU/16000/1]->Decoder->[LPCM/16000/1]->Bridge->[LPCM/16000/1]->Sink
2024-03-06 06:53:44:782177 [INFO]   Open <a43daf0874ea42f6@azuresr>
2024-03-06 06:53:44:782197 [NOTICE] AZURESR Usage: 1/1/2
2024-03-06 06:53:44:782208 [NOTICE] Usage [umsazuresr] min [0] cur [1] max [1]
2024-03-06 06:53:44:782224 [INFO]   Send Answer 0x7f99a8004838 <a43daf0874ea42f6> [c:1 a:1 v:0] Status OK
2024-03-06 06:53:44:782239 [INFO]   Local SDP 0x7f99a8004838 <a43daf0874ea42f6>
v=0
o=UniMRCPServer 0 0 IN IP4 10.0.0.4
s=-
c=IN IP4 10.0.0.4
t=0 0
m=application 1544 TCP/MRCPv2 1
a=setup:passive
a=connection:new
a=channel:a43daf0874ea42f6@speechrecog
a=cmid:1
m=audio 5024 RTP/AVP 97 102
a=rtpmap:97 PCMU/16000
a=rtpmap:102 telephone-event/16000
a=fmtp:102 0-15
a=recvonly
a=ptime:20
a=mid:1

2024-03-06 06:53:44:782549 [INFO]   Receive SIP Event [nua_i_state] Status 200 OK [SIP-Agent-1]
2024-03-06 06:53:44:782560 [NOTICE] SIP Call State 0x7f99a8004838 [completed]
2024-03-06 06:53:44:782744 [INFO]   Receive SIP Event [nua_i_ack] Status 200 OK [SIP-Agent-1]
2024-03-06 06:53:44:782757 [INFO]   Receive SIP Event [nua_i_state] Status 200 OK [SIP-Agent-1]
2024-03-06 06:53:44:782761 [NOTICE] SIP Call State 0x7f99a8004838 [ready]
2024-03-06 06:53:44:782765 [INFO]   Receive SIP Event [nua_i_active] Status 200 Call active [SIP-Agent-1]
2024-03-06 06:53:44:783035 [NOTICE] Accepted TCP/MRCPv2 Connection 10.0.0.4:1544 <-> 10.0.0.4:44106
2024-03-06 06:53:44:791756 [INFO]   Receive MRCPv2 Data 10.0.0.4:1544 <-> 10.0.0.4:44106 [335 bytes]
MRCP/2.0 335 RECOGNIZE 1
Channel-Identifier: a43daf0874ea42f6@speechrecog
Content-Id: request1@form-level
Content-Type: text/uri-list
Cancel-If-Queue: false
No-Input-Timeout: 5000
Recognition-Timeout: 10000
Start-Input-Timers: true
Confidence-Threshold: 0.5
Save-Waveform: true
Content-Length: 25

builtin:speech/transcribe
2024-03-06 06:53:44:791823 [INFO]   Assign Control Channel <a43daf0874ea42f6@speechrecog> to Connection 10.0.0.4:1544 <-> 10.0.0.4:44106 [0] -> [1]
2024-03-06 06:53:44:791888 [INFO]   Process RECOGNIZE Request <a43daf0874ea42f6@speechrecog> [1]
2024-03-06 06:53:44:791955 [INFO]   Init Speech Detector: frame-duration=10 ms, frame-size=320, max-frame-count=1550, output-frame-count=20, vad-mode=2, noinput-timeout=5000 ms, input-timeout=10000 ms, start-timeout=50 ms, complete-timeout=1000 ms, incomplete-timeout=15000 ms, leading-silence=300 ms, trailing-silence=300 ms, interim-results=1, start-of-input=external <a43daf0874ea42f6>
2024-03-06 06:53:44:791970 [INFO]   Start No-Input Timer [5000 ms] <a43daf0874ea42f6>
2024-03-06 06:53:44:791974 [INFO]   Open Waveform File for Writing /opt/unimrcp/var/atomax-utterance-a43daf0874ea42f6-1-16-kHz.wav, sampling-rate [16000]
2024-03-06 06:53:44:792041 [INFO]   Set Subscription Profile [cognitive.subscription.key] <a43daf0874ea42f6@azuresr>
2024-03-06 06:53:44:792046 [INFO]   Set Recognition Mode [interactive] <a43daf0874ea42f6@azuresr>
2024-03-06 06:53:44:792089 [INFO]   Process RECOGNIZE Response <a43daf0874ea42f6@speechrecog> [1]
2024-03-06 06:53:44:792093 [INFO]   State Transition IDLE -> RECOGNIZING <a43daf0874ea42f6@speechrecog>
2024-03-06 06:53:44:792102 [INFO]   Initiate WS connection <a43daf0874ea42f6> guid [aca0dbae3cbb4195a4d3c8e14001070e] uri [https://eastus.stt.speech.microsoft.com/speech/recognition/interactive/cognitiveservices/v1?format=detailed&language=en-US]
2024-03-06 06:53:44:792110 [INFO]   Send MRCPv2 Data 10.0.0.4:1544 <-> 10.0.0.4:44106 [83 bytes]
MRCP/2.0 83 1 200 IN-PROGRESS
Channel-Identifier: a43daf0874ea42f6@speechrecog

2024-03-06 06:53:44:804328 [INFO]   WS connected <a43daf0874ea42f6>
2024-03-06 06:53:44:804342 [INFO]   WS upgrade guid [aca0dbae3cbb4195a4d3c8e14001070e] <a43daf0874ea42f6>
2024-03-06 06:53:44:815755 [INFO]   WS upgraded <a43daf0874ea42f6>
HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Date: Wed, 06 Mar 2024 06:53:44 GMT
Server: Kestrel
Upgrade: websocket
Strict-Transport-Security: max-age=31536000; includeSubDomains
Sec-WebSocket-Accept: gvh9ZueaP4ZYkOBGy9Ch0gHvxmo=

2024-03-06 06:53:44:815800 [INFO]   Send WS msg [232 bytes] <a43daf0874ea42f6>
Path:speech.config
X-Timestamp:2024-03-06T06:53:44.815778
Content-Type:application/json; charset=utf-8

{"context":{"system":{"version":"2.0.12341"}},"device":{"manufacturer":"UniSpeech","model":"umsazuresr","version":"1.27.0"}}
2024-03-06 06:53:45:032092 [INFO]   Speech Detector State Transition NO-INPUT -> IN-PROGRESS [250 ms] <a43daf0874ea42f6>
2024-03-06 06:53:45:032148 [INFO]   Start Input Timer [10000 ms] <a43daf0874ea42f6>
2024-03-06 06:53:45:032403 [INFO]   Send WS msg [517 bytes] <a43daf0874ea42f6>
Path:speech.context
X-RequestId:ed1816c4fc674573a11f3f99cf99144c
X-Timestamp:2024-03-06T06:53:45.032369
Content-Type:application/json; charset=utf-8

{"languageId":{"languages":[],"onSuccess":{"action":"Recognize"},"onUnknown":{"action":"None"},"mode":"DetectAtAudioStart","Priority":"PrioritizeLatency"},"phraseDetection":{"onSuccess":{"action":"None"},"onUnknown":{"action":"None"}},"phraseOutput":{"interimResults":{"resultType":"Auto"},"phraseResults":{"resultType":"Always"}},"audio":{"streams":{"1":null}}}
2024-03-06 06:53:45:032571 [INFO]   Send WS bin msg [5930 bytes] <a43daf0874ea42f6>
Path:audio
X-RequestId:ed1816c4fc674573a11f3f99cf99144c
X-Timestamp:2024-03-06T06:53:45.032555
Content-Type:audio/x-wav

2024-03-06 06:53:45:036553 [INFO]   Received WS msg [186 bytes] <a43daf0874ea42f6>
X-RequestId:ed1816c4fc674573a11f3f99cf99144c
Path:turn.start
Content-Type:application/json; charset=utf-8

{
  "context": {
    "serviceTag": "3de1a8d6595c44e7a5cff086ae8bf10b"
  }
}
2024-03-06 06:53:45:037637 [INFO]   Received WS close [36 bytes] status code [1007] reason [Could not validate speech context.] <a43daf0874ea42f6>
2024-03-06 06:53:45:037678 [INFO]   Respond to WS close [36 bytes] status code [1007] reason [Could not validate speech context.] <a43daf0874ea42f6>
2024-03-06 06:53:45:037696 [INFO]   Close HTTP connection <a43daf0874ea42f6>
2024-03-06 06:53:45:037711 [INFO]   Received WS close [36 bytes] status code [1007] reason [Could not validate speech context.] <a43daf0874ea42f6>
2024-03-06 06:53:45:037722 [INFO]   Close HTTP connection <a43daf0874ea42f6>
2024-03-06 06:53:45:038010 [NOTICE] UTTM total file entries [16] timers [16]
2024-03-06 06:53:45:038115 [NOTICE] RDRM total file entries [16] timers [16]
2024-03-06 06:53:45:038150 [INFO]   Process RECOGNITION-COMPLETE Event <a43daf0874ea42f6@speechrecog> [1]
2024-03-06 06:53:45:038178 [INFO]   State Transition RECOGNIZING -> RECOGNIZED <a43daf0874ea42f6@speechrecog>
2024-03-06 06:53:45:038226 [INFO]   Send MRCPv2 Data 10.0.0.4:1544 <-> 10.0.0.4:44106 [154 bytes]
MRCP/2.0 154 RECOGNITION-COMPLETE 1 COMPLETE
Channel-Identifier: a43daf0874ea42f6@speechrecog
Completion-Cause: 006 recognizer-error
Waveform-Uri: 

2024-03-06 06:53:45:038775 [INFO]   TCP/MRCPv2 Peer Disconnected 10.0.0.4:1544 <-> 10.0.0.4:44106
2024-03-06 06:53:45:039041 [INFO]   Receive SIP Event [nua_i_bye] Status 200 Session Terminated [SIP-Agent-1]
2024-03-06 06:53:45:039065 [INFO]   Receive SIP Event [nua_i_state] Status 200 Session Terminated [SIP-Agent-1]
2024-03-06 06:53:45:039076 [NOTICE] SIP Call State 0x7f99a8004838 [terminated]
2024-03-06 06:53:45:039096 [INFO]   Receive SIP Event [nua_i_terminated] Status 200 Session Terminated [SIP-Agent-1]
2024-03-06 06:53:45:039117 [INFO]   Deactivate Session 0x7f99a8004838 <a43daf0874ea42f6>
2024-03-06 06:53:45:039128 [INFO]   Terminate Session 0x7f99a8004838 <a43daf0874ea42f6>
2024-03-06 06:53:45:039173 [INFO]   Remove Control Channel <a43daf0874ea42f6@speechrecog> [0]
2024-03-06 06:53:45:039178 [INFO]   Close <a43daf0874ea42f6@azuresr>
2024-03-06 06:53:45:039293 [INFO]   Close HTTP connection <a43daf0874ea42f6>
2024-03-06 06:53:45:039357 [NOTICE] AZURESR Usage: 0/1/2
2024-03-06 06:53:45:039363 [NOTICE] Usage [umsazuresr] min [0] cur [0] max [1]
2024-03-06 06:53:45:042183 [INFO]   Close RTP Receiver 10.0.0.4:5024 <- 10.0.0.4:4004 [r:12 l:0 j:70 p:50 d:0 i:0]
2024-03-06 06:53:45:042216 [INFO]   Remove RTP Session 10.0.0.4:5024
2024-03-06 06:53:45:042302 [NOTICE] Destroy TCP/MRCPv2 Connection 10.0.0.4:1544 <-> 10.0.0.4:44106
2024-03-06 06:53:45:042332 [NOTICE] Remove Session <a43daf0874ea42f6>
2024-03-06 06:53:45:042341 [INFO]   Session Terminated 0x7f99a8004838 <a43daf0874ea42f6>
2024-03-06 06:53:45:042384 [NOTICE] Destroy Session <a43daf0874ea42f6>

Please let us know the reason for this error. I can see that it is properly sending websocket request to port 1544.