Closed ru4sam326 closed 1 month ago
Can you please respond?
@ru4sam326 Thank you for using the JS Speech SDK and for writing this issue up. The Java Speech SDK includes an echo cancellation model that mitigates background noise, while the JS Speech SDK does not, which explains the discrepancy you've encountered. An implementation in the JS Speech SDK is TBD. A couple of options on your end:
Is there any example of using the Java API for browser-side chat? We are building a chatbot that listens to real-time speech for analysis; we went with JS, which is causing the noise you mentioned.
So how do we stream the browser audio to the server-side Java Speech SDK?
If you are implementing this chatbot on Windows, there's an "Acoustic Echo Cancellation" setting you can turn on to see if the noise for JS is mitigated, see attached picture:
For Java, if you can access the outgoing audio stream, you can presumably transform it to a 16 kHz, 16-bit PCM stream and adapt the push stream code here to send it to the Java recognizer for recognition.
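Assuming the browser capture runs at 48 kHz (a common default, not stated in the thread), one way to reach the 16 kHz, 16-bit PCM layout mentioned above is to decimate by a factor of 3. The sketch below is deliberately naive (no anti-aliasing filter) and only illustrates the sample layout; a production pipeline should use a proper resampler:

```typescript
// Naive 48 kHz -> 16 kHz decimator for 16-bit mono PCM.
// Keeps every 3rd sample; real code should low-pass filter first
// to avoid aliasing. Illustrative only.
function decimate48kTo16k(pcm48k: Int16Array): Int16Array {
  const out = new Int16Array(Math.floor(pcm48k.length / 3));
  for (let i = 0; i < out.length; i++) {
    out[i] = pcm48k[3 * i];
  }
  return out;
}
```

The resulting samples can then be serialized little-endian and fed to the recognizer's push stream.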
Hi Team,
Could you please share some samples of sending the audio stream from the browser to Java? It would be really helpful for us. We tried a few approaches, but they are not working.
Approach tried:
JS Code:
async initRecognition() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

  const options: RecordRTC.Options = {
    type: "audio",
    mimeType: "audio/wav",
    timeSlice: 3000,
    recorderType: StereoAudioRecorder,
    numberOfAudioChannels: 1,
    desiredSampRate: 16000,
    sampleRate: 16000,
    bitrate: 16,
    ondataavailable: async (blob: Blob) => this.dataavailable(blob),
  };

  const recorder = new RecordRTCPromisesHandler(stream, options);
  recorder.startRecording();
}

async dataavailable(blob: Blob) {
  console.log('blob', blob);
  if (this.socket.readyState === this.socket.OPEN) {
    this.socket.send(blob);
  }
}
Java Code:
public void handleBinaryMessage(WebSocketSession session, BinaryMessage message) throws Exception {
    byte[] arr = new byte[message.getPayloadLength()];
    message.getPayload().get(arr);

    SpeechConfig speechConfig = SpeechConfig.fromSubscription("**********************", "******");
    speechConfig.setSpeechRecognitionLanguage("en-IN");

    PushAudioInputStream pushStream = AudioInputStream.createPushStream();

    // Creates a speech recognizer using the push stream as audio input.
    AudioConfig audioInput = AudioConfig.fromStreamInput(pushStream);
    SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioInput);

    recognizer.recognized.addEventListener((s, e) -> {
        if (e.getResult().getReason() == ResultReason.RecognizedSpeech) {
            System.out.println("RECOGNIZED: Text=" + e.getResult().getText());
        } else if (e.getResult().getReason() == ResultReason.NoMatch) {
            System.out.println("NOMATCH: Speech could not be recognized.");
        }
    });

    // Feed the received audio bytes into the push stream and start recognition.
    // Note: creating a new recognizer per WebSocket message is wasteful; in
    // practice the push stream and recognizer should be created once per session
    // and only the write() call repeated here.
    pushStream.write(arr);
    recognizer.startContinuousRecognitionAsync().get();
}
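One likely pitfall with this approach (an assumption on my part, not confirmed in the thread): with mimeType "audio/wav", each blob RecordRTC emits per timeSlice begins with a 44-byte RIFF/WAV header, while the server-side push stream expects raw PCM. A sketch of stripping that header before sending over the socket:

```typescript
// Drops the canonical 44-byte WAV (RIFF) header so only raw PCM
// reaches the server-side push stream. Returns the input unchanged
// if it is too short or does not start with "RIFF".
function stripWavHeader(wav: Uint8Array): Uint8Array {
  const WAV_HEADER_BYTES = 44;
  if (wav.length <= WAV_HEADER_BYTES) return wav;
  const riff =
    wav[0] === 0x52 && wav[1] === 0x49 &&
    wav[2] === 0x46 && wav[3] === 0x46; // ASCII "RIFF"
  if (!riff) return wav;
  return wav.subarray(WAV_HEADER_BYTES);
}
```

On the browser side this could be applied to the blob bytes (via await blob.arrayBuffer()) before this.socket.send; equivalently, the Java handler could skip the first 44 bytes of each message before writing to the push stream.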
Thanks, Samba
Hi @glharper
Please ignore the above. I'm now able to stream the audio from the browser to the backend, but I'm still unable to cancel the acoustic echo. Could you please suggest a fix?
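While SDK-level echo cancellation remains unimplemented in JS, one browser-side option worth trying (a sketch, and not guaranteed to match the native SDK's AEC quality) is to request the browser's own echo cancellation and noise suppression through getUserMedia constraints:

```typescript
// Constraints asking the browser for its built-in acoustic echo
// cancellation and noise suppression on the microphone track.
// Support and effectiveness vary by browser and platform.
const captureConstraints = {
  audio: {
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
    channelCount: 1,
    sampleRate: 16000, // a hint; browsers may ignore it
  },
};

// In a browser context this object would be passed as:
//   navigator.mediaDevices.getUserMedia(captureConstraints)
```

With these constraints, audio from a Teams/Zoom call playing through the speakers may be attenuated on the captured track before it ever reaches the Speech SDK.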
@ru4sam326 Since you're using the Java Speech SDK, this question is better asked in the native Speech SDK repo. This repo is specifically for the JavaScript Speech SDK.
Thanks @glharper. Raised https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/2381 for reference, in case someone follows up.
What happened?
Hi Team,
I'm using the JS SDK, capturing speech with SpeechSDK.AudioConfig.fromDefaultMicrophoneInput. If a Teams/Zoom call is in progress in the desktop app, the other participants' audio coming out of the speakers is picked up through that microphone input. I'm using version 1.36.0.
Whereas if I do the same in Java with version 1.37.0, it does not capture the other Teams/Zoom participants' audio coming from the speakers.
Please let me know how to resolve this in JS.
Version
1.36.0 (Latest)
What browser/platform are you seeing the problem on?
No response
Relevant log output