microsoft / cognitive-services-speech-sdk-js

Microsoft Azure Cognitive Services Speech SDK for JavaScript

[Bug]: Azure TTS service not working with JS SDK with NextJS and C# SDK with Unity #764

Closed emrahtoy closed 11 months ago

emrahtoy commented 11 months ago

What happened?

I could not get a result from the TTS service. Somehow our code that was already working is also affected. Sometimes the request stalls, and other times it returns a strange error that I couldn't find any details about.

Here is what I tried:

  1. Updated the JS SDK for the NextJS app and the C# SDK for the Unity app
  2. Created a new speech service in another region
  3. Tried both configurations: from subscription key and from authorization token ( tokens are working )
  4. Downgraded the SDKs and upgraded them step by step through each version
  5. Regions used : westeurope and southcentralus
  6. Voices used : tr-TR-EmelNeural and en-US-JennyNeural
  7. Languages used : tr-TR and en-US
  8. S0 (not a Free service)

Env :

  1. Node v18.15.0
  2. NextJS 13.3.2
  3. React 18.2.0
  4. Wsl2 Ubuntu 20.04
  5. Yarn 3.5.1

Here is my code :

import {
  AudioConfig,
  Diagnostics,
  LogLevel,
  SpeechConfig,
  SpeechSynthesisOutputFormat,
  SpeechSynthesisVisemeEventArgs,
  SpeechSynthesizer
} from 'microsoft-cognitiveservices-speech-sdk'
import { NextApiRequest, NextApiResponse } from 'next/types'

const correspodings: string[] = [
  'sil',
  'O', //"oh"
  'aa',
  'U', //"ou"
  'E', //"e"
  'E', //"e"
  'I', //"ih"
  'U', //"ou"
  'O', //"oh"
  'E', //"e"
  'O', //"oh"
  'aa',
  'E',
  'R',
  'nn',
  'SS',
  'kk',
  'TH',
  'FF',
  'DD',
  'kk',
  'PP'
]

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  if (req.method == 'POST') {
    res.setHeader('Content-Type', 'application/json')

    Diagnostics.SetLoggingLevel(LogLevel.Debug)
    Diagnostics.SetLogOutputPath('LogfilePathAndName')

    //key and region
    const speechKey = process.env.SPEECH_KEY
    const speechRegion = process.env.SPEECH_REGION

    const audioConfig: AudioConfig | undefined = undefined

    if (speechKey === 'paste-your-speech-key-here' || speechRegion === 'paste-your-speech-region-here') {
      res.status(400).json({ error: 'You forgot to add your speech key or region to the .env file.' })
    } else {
      return new Promise<any>(async (resolve, reject) => {
        const speechConfig = SpeechConfig.fromSubscription(speechKey!, speechRegion!) //speechRegion!.toString()}

        speechConfig.speechSynthesisLanguage = req.body.language?.toString() || 'tr-TR'
        speechConfig.speechSynthesisVoiceName = req.body.voice?.toString() || 'tr-TR-EmelNeural'
        speechConfig.speechSynthesisOutputFormat = SpeechSynthesisOutputFormat.Audio16Khz64KBitRateMonoMp3

        const synthesizer = new SpeechSynthesizer(speechConfig, audioConfig)

        const visemes: { value: string; audioOffset: number }[] = []

        synthesizer.visemeReceived = (o: SpeechSynthesizer, e: SpeechSynthesisVisemeEventArgs): void => {
          visemes.push({ value: correspodings[e.visemeId], audioOffset: e.audioOffset })
        }

        const text = req.body?.text?.toString() // doesn't matter what ever you set

        synthesizer.synthesisStarted = (s, e) => {
          console.log('started', e)
        }
        synthesizer.synthesizing = (s, e) => {
          console.log('ing', e)
        }
        synthesizer.SynthesisCanceled = (s, e) => {
          console.log('cancel', e)
        }
        synthesizer.synthesisCompleted = (s, e) => {
          console.log('completed', e)
        }

        synthesizer.speakTextAsync(
          'Merhaba dünya',
          result => {
            console.log(result)
            synthesizer.close()
            if (result.errorDetails) {
              reject({
                mouthCues: [],
                audio: null,
                error: result.errorDetails
              })
            } else {
              const audioBuff = Buffer.from(result.audioData)
              resolve({
                mouthCues: visemes,
                audio: audioBuff.toString('base64'),
                error: { details: result.errorDetails, reason: result.reason }
              })
            }
          },
          error => {
            console.log(error)
            synthesizer.close()
            reject({
              mouthCues: [],
              audio: null,
              error: error
            })
          }
        )
      })
    }
  } else {
    res.status(405).end()
  }
}
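One thing worth noting about the handler above: it wraps the callback-style `speakTextAsync` in a promise but never writes the resolved value to `res`, and a rejection surfaces as an unhandled server error. A minimal sketch of one way to isolate the promise wrapping follows. `SpeakLike`, `SpeakResult`, and `speakOnce` are hypothetical names, and the interfaces only mirror the members the code above actually uses (`speakTextAsync`, `close`, `errorDetails`, `audioData`); they are not the SDK's own declarations.

```typescript
// Structural stand-ins for the parts of the SDK the handler touches.
// Illustrative types only, not the SDK's declarations.
interface SpeakResult {
  errorDetails?: string
  audioData?: ArrayBuffer
}

interface SpeakLike {
  speakTextAsync(
    text: string,
    onResult: (result: SpeakResult) => void,
    onError: (error: string) => void
  ): void
  close(): void
}

// Resolve with the result on success; reject with an Error otherwise.
// The synthesizer is closed once on whichever callback fires.
function speakOnce(synthesizer: SpeakLike, text: string): Promise<SpeakResult> {
  return new Promise<SpeakResult>((resolve, reject) => {
    synthesizer.speakTextAsync(
      text,
      result => {
        synthesizer.close()
        if (result.errorDetails) {
          reject(new Error(result.errorDetails))
        } else {
          resolve(result)
        }
      },
      error => {
        synthesizer.close()
        reject(new Error(error))
      }
    )
  })
}
```

In the API route, the handler could then `await speakOnce(synthesizer, text)` inside a `try`/`catch` and call `res.status(...).json(...)` on both paths, so every request gets a response. This does not address the stalled requests themselves; it only makes failures visible to the caller.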

How to reproduce :

  1. Create nextjs 13 app with pages directory
  2. Put the code above into pages/api directory ( works as an endpoint )
  3. Set the environment variables
  4. yarn dev or npm run dev
  5. make a request to the endpoint

I have seen a bug about the newline format ( "\n", "\r\n" ) before; could this be the case? But then why did my Unity projects ( which have been working in the cloud for a long time ) also stop working? It does not matter whether the project is in development or production, with or without {swcMinify: false}: the problem always exists ( for NextJS apps ).

By the way, STT ( speech to text, speech recognition ) is working.

I would like someone to help me dig deeper until the problem is fixed. Thank you.

Version

1.33.0 (Latest)

What browser/platform are you seeing the problem on?

Node

Relevant log output

Logs when the request stalls :

2023-11-19T10:56:29.568Z | SynthesisTriggeredEvent | privName: SynthesisTriggeredEvent | privEventId: 38C3FD1B0E0B40D4BF2FBF074381107E | privEventTime: 2023-11-19T10:56:29.568Z | privEventType: 1 | privMetadata: {} | privRequestId: 0A028714E64F4D8CBEE0D8E62EB21873 | privSessionAudioDestinationId: <NULL> | privTurnAudioDestinationId: <NULL>
ConsoleLoggingListener.js:34
2023-11-19T10:56:29.569Z | ConnectingToSynthesisServiceEvent | privName: ConnectingToSynthesisServiceEvent | privEventId: 307B9A5458624247B8CE3BB49C51EA03 | privEventTime: 2023-11-19T10:56:29.569Z | privEventType: 1 | privMetadata: {} | privRequestId: 0A028714E64F4D8CBEE0D8E62EB21873 | privAuthFetchEventId: 11BC41234E5F4815A1CF6CCAB291F28C
ConsoleLoggingListener.js:34
2023-11-19T10:56:29.576Z | ConnectionStartEvent | privName: ConnectionStartEvent | privEventId: E5E237A79FC847F7A094B006AF8809CF | privEventTime: 2023-11-19T10:56:29.576Z | privEventType: 1 | privMetadata: {} | privConnectionId: C8EF7F0D7A884C859DB8450B1D8C720D | privUri: wss://southcentralus.tts.speech.microsoft.com/cognitiveservices/websocket/v1?Ocp-Apim-Subscription-Key=subscriptionkeyfiltered&X-ConnectionId=C8EF7F0D7A884C859DB8450B1D8C720D | privHeaders: <NULL>
ConsoleLoggingListener.js:34
2023-11-19T10:56:30.215Z | ConnectionEstablishedEvent | privName: ConnectionEstablishedEvent | privEventId: C02C3F705C634C7CBD236CFBBC25666E | privEventTime: 2023-11-19T10:56:30.215Z | privEventType: 1 | privMetadata: {} | privConnectionId: C8EF7F0D7A884C859DB8450B1D8C720D
ConsoleLoggingListener.js:34
2023-11-19T10:56:30.215Z | SynthesisStartedEvent | privName: SynthesisStartedEvent | privEventId: 4B2671E704F348EC9788BEEA33BA98DF | privEventTime: 2023-11-19T10:56:30.215Z | privEventType: 1 | privMetadata: {} | privRequestId: 0A028714E64F4D8CBEE0D8E62EB21873 | privAuthFetchEventId: 11BC41234E5F4815A1CF6CCAB291F28C
ConsoleLoggingListener.js:34
2023-11-19T10:56:30.216Z | ConnectionMessageSentEvent | privName: ConnectionMessageSentEvent | privEventId: 0F823895935E46D08D6D00E0C053DABD | privEventTime: 2023-11-19T10:56:30.216Z | privEventType: 1 | privMetadata: {} | privConnectionId: C8EF7F0D7A884C859DB8450B1D8C720D | privNetworkSentTime: 2023-11-19T10:56:30.216Z | privMessage: {"privBody":"{\"context\":{\"system\":{\"name\":\"SpeechSDK\",\"version\":\"1.33.1\",\"build\":\"JavaScript\",\"lang\":\"JavaScript\"},\"os\":{\"platform\":\"Node\",\"name\":\"unknown\",\"version\":\"unknown\"}}}","privMessageType":0,"privHeaders":{"Path":"speech.config","X-RequestId":"0A028714E64F4D8CBEE0D8E62EB21873","X-Timestamp":"2023-11-19T10:56:30.216Z","Content-Type":"application/json"},"privId":"601BDDC65A334678A9CE9356935C9F92","privSize":165,"privPath":"speech.config","privRequestId":"0A028714E64F4D8CBEE0D8E62EB21873","privContentType":"application/json"}
ConsoleLoggingListener.js:34
2023-11-19T10:56:30.217Z | ConnectionMessageSentEvent | privName: ConnectionMessageSentEvent | privEventId: 931315B16B5241B782DFB3D4A40DB232 | privEventTime: 2023-11-19T10:56:30.217Z | privEventType: 1 | privMetadata: {} | privConnectionId: C8EF7F0D7A884C859DB8450B1D8C720D | privNetworkSentTime: 2023-11-19T10:56:30.217Z | privMessage: {"privBody":"{\"synthesis\":{\"audio\":{\"metadataOptions\":{\"bookmarkEnabled\":false,\"punctuationBoundaryEnabled\":\"false\",\"sentenceBoundaryEnabled\":\"false\",\"sessionEndEnabled\":true,\"visemeEnabled\":true,\"wordBoundaryEnabled\":\"false\"},\"outputFormat\":\"audio-16khz-64kbitrate-mono-mp3\"},\"language\":{\"autoDetection\":false}}}","privMessageType":0,"privHeaders":{"Path":"synthesis.context","X-RequestId":"0A028714E64F4D8CBEE0D8E62EB21873","X-Timestamp":"2023-11-19T10:56:30.217Z","Content-Type":"application/json"},"privId":"30C8FBE4B9094EC8B84FE56CF5483190","privSize":300,"privPath":"synthesis.context","privRequestId":"0A028714E64F4D8CBEE0D8E62EB21873","privContentType":"application/json"}
ConsoleLoggingListener.js:34
2023-11-19T10:56:30.218Z | ConnectionMessageSentEvent | privName: ConnectionMessageSentEvent | privEventId: 132FFEB17C3C4B0480CEC9A714BE5DE8 | privEventTime: 2023-11-19T10:56:30.218Z | privEventType: 1 | privMetadata: {} | privConnectionId: C8EF7F0D7A884C859DB8450B1D8C720D | privNetworkSentTime: 2023-11-19T10:56:30.218Z | privMessage: {"privBody":"<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xmlns:mstts='http://www.w3.org/2001/mstts' xmlns:emo='http://www.w3.org/2009/10/emotionml' xml:lang='tr-TR'><voice name='tr-TR-EmelNeural'>Merhaba dünya</voice></speak>","privMessageType":0,"privHeaders":{"Path":"ssml","X-RequestId":"0A028714E64F4D8CBEE0D8E62EB21873","X-Timestamp":"2023-11-19T10:56:30.217Z","Content-Type":"application/ssml+xml"},"privId":"CD552B62918B446CA91C5F30DD5AA2D5","privSize":233,"privPath":"ssml","privRequestId":"0A028714E64F4D8CBEE0D8E62EB21873","privContentType":"application/ssml+xml"}
ConsoleLoggingListener.js:34
started SpeechSynthesisEventArgs {privResult: SpeechSynthesisResult}

Error logs ( when an error is returned ) :

2023-11-19T10:57:49.954Z | SynthesisTriggeredEvent | privName: SynthesisTriggeredEvent | privEventId: 5B05463A4A7A4AF79A309097729228A2 | privEventTime: 2023-11-19T10:57:49.954Z | privEventType: 1 | privMetadata: {} | privRequestId: 437A3F45D95E4CB8AA9552C93B9661EF | privSessionAudioDestinationId: <NULL> | privTurnAudioDestinationId: <NULL>
ConsoleLoggingListener.js:34
2023-11-19T10:57:49.955Z | ConnectingToSynthesisServiceEvent | privName: ConnectingToSynthesisServiceEvent | privEventId: 31D561D5BE424372AEF92FD41A8BD1E5 | privEventTime: 2023-11-19T10:57:49.955Z | privEventType: 1 | privMetadata: {} | privRequestId: 437A3F45D95E4CB8AA9552C93B9661EF | privAuthFetchEventId: BC42782A02254F6985CE35545C4E314E
ConsoleLoggingListener.js:34
2023-11-19T10:57:49.959Z | ConnectionStartEvent | privName: ConnectionStartEvent | privEventId: 7A22BE993AE94CC298C757810A246CA3 | privEventTime: 2023-11-19T10:57:49.959Z | privEventType: 1 | privMetadata: {} | privConnectionId: D97480B6808C4B9B9F68A0C3EBFC6DB4 | privUri: wss://southcentralus.tts.speech.microsoft.com/cognitiveservices/websocket/v1?Ocp-Apim-Subscription-Key=subscriptionkeyfiltered&X-ConnectionId=D97480B6808C4B9B9F68A0C3EBFC6DB4 | privHeaders: <NULL>
ConsoleLoggingListener.js:34
2023-11-19T10:57:50.786Z | ConnectionEstablishedEvent | privName: ConnectionEstablishedEvent | privEventId: E75E57146ED2427DA02148F39A443665 | privEventTime: 2023-11-19T10:57:50.786Z | privEventType: 1 | privMetadata: {} | privConnectionId: D97480B6808C4B9B9F68A0C3EBFC6DB4
ConsoleLoggingListener.js:34
2023-11-19T10:57:50.787Z | SynthesisStartedEvent | privName: SynthesisStartedEvent | privEventId: F273B9FC944E46809A0427CDF164D303 | privEventTime: 2023-11-19T10:57:50.787Z | privEventType: 1 | privMetadata: {} | privRequestId: 437A3F45D95E4CB8AA9552C93B9661EF | privAuthFetchEventId: BC42782A02254F6985CE35545C4E314E
ConsoleLoggingListener.js:34
2023-11-19T10:57:50.787Z | ConnectionMessageSentEvent | privName: ConnectionMessageSentEvent | privEventId: DF87AAF604D7434AADC9D9033016BDC7 | privEventTime: 2023-11-19T10:57:50.787Z | privEventType: 1 | privMetadata: {} | privConnectionId: D97480B6808C4B9B9F68A0C3EBFC6DB4 | privNetworkSentTime: 2023-11-19T10:57:50.787Z | privMessage: {"privBody":"{\"context\":{\"system\":{\"name\":\"SpeechSDK\",\"version\":\"1.33.1\",\"build\":\"JavaScript\",\"lang\":\"JavaScript\"},\"os\":{\"platform\":\"Node\",\"name\":\"unknown\",\"version\":\"unknown\"}}}","privMessageType":0,"privHeaders":{"Path":"speech.config","X-RequestId":"437A3F45D95E4CB8AA9552C93B9661EF","X-Timestamp":"2023-11-19T10:57:50.787Z","Content-Type":"application/json"},"privId":"E7FFB26BF6CF4BEAAF6FA8EE49972E71","privSize":165,"privPath":"speech.config","privRequestId":"437A3F45D95E4CB8AA9552C93B9661EF","privContentType":"application/json"}
ConsoleLoggingListener.js:34
2023-11-19T10:57:50.789Z | ConnectionMessageSentEvent | privName: ConnectionMessageSentEvent | privEventId: 1B86370C5B144F8DB16D9E1117A79690 | privEventTime: 2023-11-19T10:57:50.789Z | privEventType: 1 | privMetadata: {} | privConnectionId: D97480B6808C4B9B9F68A0C3EBFC6DB4 | privNetworkSentTime: 2023-11-19T10:57:50.788Z | privMessage: {"privBody":"{\"synthesis\":{\"audio\":{\"metadataOptions\":{\"bookmarkEnabled\":false,\"punctuationBoundaryEnabled\":\"false\",\"sentenceBoundaryEnabled\":\"false\",\"sessionEndEnabled\":true,\"visemeEnabled\":true,\"wordBoundaryEnabled\":\"false\"},\"outputFormat\":\"audio-16khz-64kbitrate-mono-mp3\"},\"language\":{\"autoDetection\":false}}}","privMessageType":0,"privHeaders":{"Path":"synthesis.context","X-RequestId":"437A3F45D95E4CB8AA9552C93B9661EF","X-Timestamp":"2023-11-19T10:57:50.788Z","Content-Type":"application/json"},"privId":"EAD0A07980904936998D2F4F9AC46B19","privSize":300,"privPath":"synthesis.context","privRequestId":"437A3F45D95E4CB8AA9552C93B9661EF","privContentType":"application/json"}
ConsoleLoggingListener.js:34
2023-11-19T10:57:50.790Z | ConnectionMessageSentEvent | privName: ConnectionMessageSentEvent | privEventId: 00EC4461A955411BB3FC77AA3D083EE2 | privEventTime: 2023-11-19T10:57:50.790Z | privEventType: 1 | privMetadata: {} | privConnectionId: D97480B6808C4B9B9F68A0C3EBFC6DB4 | privNetworkSentTime: 2023-11-19T10:57:50.789Z | privMessage: {"privBody":"<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xmlns:mstts='http://www.w3.org/2001/mstts' xmlns:emo='http://www.w3.org/2009/10/emotionml' xml:lang='tr-TR'><voice name='tr-TR-EmelNeural'>Merhaba dünya</voice></speak>","privMessageType":0,"privHeaders":{"Path":"ssml","X-RequestId":"437A3F45D95E4CB8AA9552C93B9661EF","X-Timestamp":"2023-11-19T10:57:50.789Z","Content-Type":"application/ssml+xml"},"privId":"B0323EAE2A1E4856A703303F94E0FCA9","privSize":233,"privPath":"ssml","privRequestId":"437A3F45D95E4CB8AA9552C93B9661EF","privContentType":"application/ssml+xml"}
ConsoleLoggingListener.js:34
started SpeechSynthesisEventArgs {privResult: SpeechSynthesisResult}
tts1.ts:73
cancel SpeechSynthesisEventArgs {privResult: SpeechSynthesisResult}
tts1.ts:79
arg1:
SpeechSynthesisEventArgs {privResult: SpeechSynthesisResult}
privResult:
SpeechSynthesisResult {privResultId: '437A3F45D95E4CB8AA9552C93B9661EF', privReason: 1, privErrorDetails: 'Invalid message format. Text message contains no data websocket error code: 1007', privProperties: PropertyCollection, privAudioData: undefined, …}
result:
ƒ result() {\n        return this.privResult;\n    }
[[Prototype]]:
Object
SpeechSynthesisResult {privResultId: '437A3F45D95E4CB8AA9552C93B9661EF', privReason: 1, privErrorDetails: 'Invalid message format. Text message contains no data websocket error code: 1007', privProperties: PropertyCollection, privAudioData: undefined, …}
tts1.ts:88
2023-11-19T10:57:51.965Z | ConnectionClosedEvent | privName: ConnectionClosedEvent | privEventId: 5054476ECAF74CE7BC17DC7E22801C45 | privEventTime: 2023-11-19T10:57:51.965Z | privEventType: 0 | privMetadata: {} | privConnectionId: D97480B6808C4B9B9F68A0C3EBFC6DB4 | privReason: Invalid message format. Text message contains no data | privStatusCode: 1007
ConsoleLoggingListener.js:30
error - Error: {"mouthCues":[],"audio":null,"error":"Invalid message format. Text message contains no data websocket error code: 1007"}
    at getProperError (/home/emrahtoy/dev/backoffice/node_modules/next/dist/lib/is-error.js:41:12)
    at DevServer.run (/home/emrahtoy/dev/backoffice/node_modules/next/dist/server/dev/next-dev-server.js:924:53)
    at process.processTicksAndRejections (/home/emrahtoy/dev/backoffice/lib/internal/process/task_queues.js:95:5)
    at async DevServer.handleRequestImpl (/home/emrahtoy/dev/backoffice/node_modules/next/dist/server/base-server.js:533:20) {name: 'Error', digest: undefined, stack: 'Error: {"mouthCues":[],"audio":null,"error":"…dules/next/dist/server/base-server.js:533:20)', message: '{"mouthCues":[],"audio":null,"error":"Invali…ontains no data websocket error code: 1007"}', Symbol(NextjsError): 'server'}

I also get this error sometimes :

error - uncaughtException: ArgumentNull: payload
    at new RawWebsocketMessage (file:///home/emrahtoy/dev/backoffice/node_modules/microsoft-cognitiveservices-speech-sdk/distrib/lib/src/common/RawWebsocketMessage.js:11:19)
    at privWebsocketClient.onmessage (file:///home/emrahtoy/dev/backoffice/node_modules/microsoft-cognitiveservices-speech-sdk/distrib/lib/src/common.browser/WebsocketMessageAdapter.js:137:40)
    at [nodejs.internal.kHybridDispatch] (/home/emrahtoy/dev/backoffice/lib/internal/event_target.js:735:20)
    at WebSocket.dispatchEvent (/home/emrahtoy/dev/backoffice/lib/internal/event_target.js:677:26)
    at fireEvent (/home/emrahtoy/dev/backoffice/node_modules/next/dist/compiled/undici/index.js:3:4860)
    at websocketMessageReceived (/home/emrahtoy/dev/backoffice/node_modules/next/dist/compiled/undici/index.js:3:5188)
    at ByteParser.run (/home/emrahtoy/dev/backoffice/node_modules/next/dist/compiled/undici/index.js:3:3261)
    at ByteParser._write (/home/emrahtoy/dev/backoffice/node_modules/next/dist/compiled/undici/index.js:3:967)
    at writeOrBuffer (/home/emrahtoy/dev/backoffice/lib/internal/streams/writable.js:392:12)
    at _write (/home/emrahtoy/dev/backoffice/lib/internal/streams/writable.js:333:10)
    at Writable.write (/home/emrahtoy/dev/backoffice/lib/internal/streams/writable.js:337:10)
    at TLSSocket.onSocketData (/home/emrahtoy/dev/backoffice/node_modules/next/dist/compiled/undici/index.js:2:272570)
    at TLSSocket.emit (/home/emrahtoy/dev/backoffice/lib/events.js:513:28)
    at addChunk (/home/emrahtoy/dev/backoffice/lib/internal/streams/readable.js:324:12)
    at readableAddChunk (/home/emrahtoy/dev/backoffice/lib/internal/streams/readable.js:297:9)
    at Readable.push (/home/emrahtoy/dev/backoffice/lib/internal/streams/readable.js:234:10)
    at TLSWrap.onStreamRead (/home/emrahtoy/dev/backoffice/lib/internal/stream_base_commons.js:190:23)
    at TLSWrap.callbackTrampoline (node:internal/async_hooks:130:17) {name: 'ArgumentNull', digest: undefined, stack: 'ArgumentNull: payload
    at new RawWebsocket…Trampoline (node:internal/async_hooks:130:17)', message: 'payload', Symbol(NextjsError): 'server'}
yulin-li commented 11 months ago

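Hi, I tried to repro your issue, but it seems your code works for me. Here are the steps I tried:

  • npx create-next-app@latest my-nextjs-app
  • copy your code into pages/api/index.ts
  • npm install microsoft-cognitiveservices-speech-sdk
  • npm run dev

And curl -X POST http://localhost:3000/api -d '{}'

My environment:

  • Ubuntu 22.04
  • Nodejs v18.18.0
  • "microsoft-cognitiveservices-speech-sdk": "^1.33.1", "next": "14.0.3", "react": "^18",

I am not familiar with NextJS; could you try to find the difference between your environment and mine?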

emrahtoy commented 11 months ago

Hello @yulin-li ;

The only significant difference is the NextJS version. I will try this version despite the possible incompatibility with the rest of the project, but I should mention that our Unity projects ( in production ) also stopped working.

On the other hand, I have seen that some of the endpoint addresses changed in our Azure account, as shown in the screenshot below ( these are not changes that we made ):

image

I will share results with the different version of the NextJS.

yulin-li commented 11 months ago

You said the Unity projects were broken? Does that mean the latest Speech SDK (1.33.1) has a regression? Does 1.32 or earlier work for your Unity projects?

emrahtoy commented 11 months ago

You said the Unity projects were broken? Does that mean the latest Speech SDK (1.33.1) has a regression? Does 1.32 or earlier work for your Unity projects?

No, I mean our production projects ( which are already in service ) also stopped working. But right now they have started working again... I really don't understand why. So we can forget about the Unity and C# problem.

I have updated to the latest NextJS 13 and the latest NodeJS 18 LTS; unfortunately, nothing changed.

emrahtoy commented 11 months ago

Hi, I tried to repro your issue, but it seems your code works for me. Here are the steps I tried:

  • npx create-next-app@latest my-nextjs-app
  • copy your codes into pages/api/index.ts
  • npm install microsoft-cognitiveservices-speech-sdk
  • npm run dev

And curl -X POST http://localhost:3000/api -d '{}'

my environment:

  • Ubuntu 22.04
  • Nodejs v18.18.0
  • "microsoft-cognitiveservices-speech-sdk": "^1.33.1", "next": "14.0.3", "react": "^18",

I am not familiar with next js, could you try to find the difference between your environment and mine?

I got the same result as you with the new NextJS ( 14 ): it works. So the problem is evidently related to the NextJS version. I tried the same setup with NextJS 13 and it does not work at all.

yulin-li commented 11 months ago

OK, from the service side we can see that your failed request sent an empty WebSocket message, which was rejected by our service.

Please let me know if you need any help from our side.
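Since the service rejects an empty text frame with close code 1007, one cheap thing to rule out is empty input on the application side. The sketch below builds a trimmed version of the SSML envelope seen in the logs above (the `mstts` and `emo` namespaces are omitted) and refuses blank text. `buildSsml` and `escapeXml` are hypothetical helpers, not SDK functions. Note that in the logs the SDK did log a non-empty SSML body, so this guard would not catch a frame that is emptied somewhere in the WebSocket layer.

```typescript
// Escape the five XML special characters so user text cannot break the SSML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
}

// Build a minimal SSML document, failing fast on blank input instead of
// letting an empty request reach the service.
function buildSsml(text: string, lang: string, voice: string): string {
  if (text.trim().length === 0) {
    throw new Error('Refusing to synthesize empty text')
  }
  return (
    `<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='${lang}'>` +
    `<voice name='${voice}'>${escapeXml(text)}</voice></speak>`
  )
}
```

With `buildSsml`, the request text from `req.body.text` could be validated before any connection is opened, and a 400 response returned if it throws.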

emrahtoy commented 11 months ago

Unfortunately, nothing worked with NextJS 13, so I had to upgrade to NextJS 14, and I am not getting any errors right now. I haven't had enough time to dig deeper, so I have closed the issue and will investigate later. Thank you for testing NextJS 14, @yulin-li.
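For anyone investigating later: the `ArgumentNull: payload` trace earlier in the thread runs through `next/dist/compiled/undici` and the SDK's `common.browser/WebsocketMessageAdapter.js`. That suggests (an inference from the stack, not something confirmed in this thread) that the connection used a global `WebSocket` provided by the NextJS runtime rather than a Node-specific client. A small check like the sketch below could help compare environments; `ScopeLike` and `detectWebSocketSource` are hypothetical names.

```typescript
// Minimal shape of a global scope for this check; only the WebSocket slot matters.
interface ScopeLike {
  WebSocket?: unknown
}

type WebSocketSource = 'global' | 'none'

// Report whether the given scope exposes a WebSocket constructor.
// In a handler this would be called as detectWebSocketSource(globalThis as ScopeLike).
function detectWebSocketSource(scope: ScopeLike): WebSocketSource {
  return typeof scope.WebSocket === 'function' ? 'global' : 'none'
}
```

Logging the result at the top of the API route under NextJS 13 and NextJS 14 would show whether the two runtimes differ in this respect.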