We rely heavily on the Azure Cognitive Services text-to-speech API to convert text to speech. We then pass the generated speech audio (WAV) to an IVR service, which requires audio files to have the following codec configuration:
To match this configuration, we used the following output formats:
Riff8Khz16BitMonoPcm | Raw8Khz16BitMonoPcm (file format error)
We verified that the generated speech audio file has the same configuration the IVR service requires. However, when we upload the generated audio to the IVR, it responds with an error, while any other WAV file produced by audio tools works fine. See the attached audio files.
We used an online tool (metadata2go) to view the metadata of both files, but could not spot any difference. Here's the metadata link for both files
To dig further into this issue, we passed the Azure-generated speech audio file through a tool called 3Cx to convert it to WAV format again. This time the IVR service accepted the converted file, and it works fine there.
Here's converted file (success.wav ---> converted_success.wav): Converted - Azure Speech File
Using the metadata tool, we can view each audio file's Raw Header to see what changed after converting the Azure speech audio file.
Raw Header (success.wav)
Raw Header (converted_success.wav)
Raw Header (Audio Tool - 3thKisaan.wav)
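The same header comparison can also be done locally instead of through an online tool. The following is a minimal sketch, assuming the files are plain RIFF/WAVE files with the `data` chunk last; the function name `read_wav_header` and the dictionary keys are hypothetical. It reports the sizes declared in the header next to the sizes implied by the real file length, which general metadata viewers usually do not show.

```python
import struct


def read_wav_header(data: bytes) -> dict:
    """Parse RIFF/WAVE header fields from the raw bytes of a file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    info = {
        # Size declared at byte offset 4 (little-endian).
        "riff_size": struct.unpack_from("<I", data, 4)[0],
        # Size the field should hold: file length minus the 8-byte RIFF header.
        "actual_riff_size": len(data) - 8,
    }
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            (info["format_tag"], info["channels"], info["sample_rate"],
             info["byte_rate"], info["block_align"],
             info["bits_per_sample"]) = struct.unpack_from("<HHIIHH", data, body)
        elif chunk_id == b"data":
            info["data_size"] = size
            # Assumption: the data chunk runs to the end of the file.
            info["actual_data_size"] = len(data) - body
            break
        # Chunks are padded to an even number of bytes.
        pos = body + size + (size & 1)
    return info
```

Calling `read_wav_header(open("success.wav", "rb").read())` for each of the three files and comparing the results would show whether the codec fields really match and whether the declared sizes agree with the actual ones.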
We can clearly see that after conversion, the first few hex codes of the Azure speech audio file's Raw Header changed to E2 1B 03. How can we resolve this in Azure Cognitive Services?
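One possible reading, which is an assumption and not something confirmed above: if the bytes E2 1B 03 sit at byte offset 4, they would be the little-endian RIFF chunk size (0x00031BE2 = 203,746 bytes, assuming the fourth byte is 00), meaning the conversion filled in a length field that the original file left unset. A minimal sketch to test that theory without a converter, using a hypothetical helper `fix_wav_sizes` that rewrites only the two size fields:

```python
import struct


def fix_wav_sizes(data: bytes) -> bytes:
    """Return a copy with the RIFF and data chunk sizes set from the real length."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    out = bytearray(data)
    # RIFF chunk size = total length minus the 8-byte "RIFF" + size header.
    struct.pack_into("<I", out, 4, len(out) - 8)
    pos = 12
    while pos + 8 <= len(out):
        chunk_id, size = struct.unpack_from("<4sI", out, pos)
        if chunk_id == b"data":
            # Assumption: the data chunk is last and runs to the end of the file.
            struct.pack_into("<I", out, pos + 4, len(out) - (pos + 8))
            return bytes(out)
        # Skip other chunks, honoring even-byte padding.
        pos += 8 + size + (size & 1)
    raise ValueError("no data chunk found")
```

If a file patched this way is accepted by the IVR, the size fields are the likely cause; if it is still rejected, the difference lies elsewhere in the header.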