microsoft / cognitive-services-speech-sdk-js

Microsoft Azure Cognitive Services Speech SDK for JavaScript

[Bug]: Parsing WAV header #841

Open ed-sparkes opened 4 months ago

ed-sparkes commented 4 months ago

What happened?

FileAudioSource assumes the fmt chunk comes straight after the WAVE identifier, but this is not always the case: some WAV files have a JUNK chunk between WAVE and fmt.

The relevant code is here: https://github.com/microsoft/cognitive-services-speech-sdk-js/blob/e89846a774639bfb05e9550d4d759871790cf627/src/common.browser/FileAudioSource.ts#L167

Could the code be updated to loop through the chunks until it finds the fmt chunk, rather than assuming its position?
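To illustrate the kind of loop I mean, here is a minimal sketch, not the SDK's code. The names `readFourCC` and `findChunk` and the error text are made up for this example. It assumes the buffer passed in covers enough of the file to reach the chunk being searched for, and that chunk payloads are padded to an even length as in standard RIFF files.

```typescript
// Read a four-character chunk ID (e.g. "RIFF", "fmt ") at the given byte offset.
function readFourCC(view: DataView, offset: number): string {
    let id = "";
    for (let i = 0; i < 4; i++) {
        id += String.fromCharCode(view.getUint8(offset + i));
    }
    return id;
}

// Hypothetical helper (not part of the Speech SDK): walk the sub-chunks of a
// RIFF/WAVE buffer and return the payload offset and size of the first chunk
// whose ID matches, or undefined if the buffer ends before it is found.
function findChunk(
    header: ArrayBuffer,
    wantedId: string
): { dataOffset: number; size: number } | undefined {
    const view = new DataView(header);
    if (view.byteLength < 12 || readFourCC(view, 0) !== "RIFF" || readFourCC(view, 8) !== "WAVE") {
        throw new Error("Not a RIFF/WAVE file"); // illustrative message only
    }

    let offset = 12; // first sub-chunk follows "RIFF" + 4-byte size + "WAVE"
    while (offset + 8 <= view.byteLength) {
        const id = readFourCC(view, offset);
        const size = view.getUint32(offset + 4, true); // sizes are little-endian
        if (id === wantedId) {
            return { dataOffset: offset + 8, size };
        }
        // Skip this chunk: 8-byte header, payload, and a pad byte if size is odd.
        offset += 8 + size + (size % 2);
    }
    return undefined;
}
```

With something like this, `findChunk(buffer, "fmt ")` would skip over JUNK (or any other unknown chunk) instead of failing when fmt is not at byte 12.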

A hex dump of my WAV file is below. Notice there is also a FLLR chunk before the data chunk.

image

...

image

My file is a valid WAV file generated using expo-av from the device microphone, but I cannot use it with SpeechRecognizer because the header parsing fails with:

"Invalid WAV header in file, WAVEfmt was not found"

Version

1.36.0 (Latest)

What browser/platform are you seeing the problem on?

No response

Relevant log output

No response

ed-sparkes commented 4 months ago

I found a RIFF chunk parser, https://github.com/rochars/riff-chunks, and ran it over my file:

{
  "chunkId": "RIFF",
  "chunkSize": 81722,
  "format": "WAVE",
  "subChunks": [
    {
      "chunkId": "JUNK",
      "chunkSize": 28,
      "chunkData": {
        "start": 20,
        "end": 48
      }
    },
    {
      "chunkId": "fmt ",
      "chunkSize": 16,
      "chunkData": {
        "start": 56,
        "end": 72
      }
    },
    {
      "chunkId": "FLLR",
      "chunkSize": 4008,
      "chunkData": {
        "start": 80,
        "end": 4088
      }
    },
    {
      "chunkId": "data",
      "chunkSize": 77634,
      "chunkData": {
        "start": 4096,
        "end": 81730
      }
    }
  ]
}

Perhaps a similar approach could be used to locate the fmt chunk rather than assuming its position.
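For reference, the offsets in that output follow directly from the chunk sizes: each chunk has an 8-byte header (4-byte ID plus 4-byte little-endian size), so the JUNK payload starts at 12 + 8 = 20 and the fmt payload at 48 + 8 = 56. A rough sketch that enumerates chunks the same way is below. `ChunkInfo` and `listChunks` are names invented for this example, the field names only mirror the output above, and the function does no validation of the RIFF/WAVE identifiers.

```typescript
// Shape loosely mirroring the parser output above (hypothetical type).
interface ChunkInfo {
    chunkId: string;
    chunkSize: number;
    start: number; // offset of the first payload byte
    end: number;   // offset just past the last payload byte
}

// Hypothetical helper: list every sub-chunk found in a RIFF buffer.
function listChunks(file: ArrayBuffer): ChunkInfo[] {
    const view = new DataView(file);
    const fourCC = (o: number): string =>
        String.fromCharCode(
            view.getUint8(o), view.getUint8(o + 1),
            view.getUint8(o + 2), view.getUint8(o + 3));

    const chunks: ChunkInfo[] = [];
    let offset = 12; // skip "RIFF", the 4-byte RIFF size, and "WAVE"
    while (offset + 8 <= view.byteLength) {
        const chunkSize = view.getUint32(offset + 4, true); // little-endian
        const start = offset + 8;
        chunks.push({ chunkId: fourCC(offset), chunkSize, start, end: start + chunkSize });
        // Advance past the payload, plus one pad byte when the size is odd.
        offset = start + chunkSize + (chunkSize % 2);
    }
    return chunks;
}
```

In my file the data chunk header only begins at byte 4088, so a reader that wants to find it this way has to look at least that far into the file, or seek chunk by chunk.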