-
Right now we are using a subset of Libri-Light, which is a very large (60k hours) dataset of audiobooks read by thousands of speakers. It is pretty good but there is a lot of (probably more expressive a…
-
Hello,
I am using the code below to build a voice agent; most of it has been gathered from different examples. I am facing the following problems:
1- interruption handling is bad compared to e…
-
I'd like three new optional fields added to the JSON video files to support subtitles/transcripts.
"original_machine_srt_subtitles": "a large string of srt-formatted subtitles from Whisper spee…
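To illustrate how a consumer of these files might read such a field, here is a minimal sketch in Python. It assumes only that `original_machine_srt_subtitles` holds a standard SRT string (numbered cues, a `-->` timing line, then text). The `Cue` type, the `parse_srt` helper, and the sample data are hypothetical and not part of any existing code.

```python
import json
import re
from typing import NamedTuple


class Cue(NamedTuple):
    """One subtitle cue (hypothetical helper type for this sketch)."""
    index: int
    start: str
    end: str
    text: str


# SRT timing line, e.g. "00:00:01,000 --> 00:00:03,500"
_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt: str) -> list[Cue]:
    """Split an SRT string into cues, skipping malformed blocks."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue
        match = _TIMING.match(lines[1].strip())
        if match is None:
            continue
        text = "\n".join(lines[2:])
        cues.append(Cue(int(lines[0]), match.group(1), match.group(2), text))
    return cues


# Made-up sample record; only the field name comes from the request above.
sample_json = json.dumps({
    "original_machine_srt_subtitles": (
        "1\n00:00:01,000 --> 00:00:03,500\nHello there.\n\n"
        "2\n00:00:04,000 --> 00:00:06,000\nSecond line.\n"
    )
})

video = json.loads(sample_json)
# The field is optional, so fall back to an empty string when it is absent.
cues = parse_srt(video.get("original_machine_srt_subtitles", ""))
```

Because the field is optional, the sketch uses `dict.get` with an empty default, so files without subtitles simply yield an empty cue list.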
-
Just a handy issue to be notified of the latest changes and micro-releases (we will mostly be changing the models)
-
Hi,
In the specifications of the challenge it is written:
> IEEE ICASSP 2023 Deep Noise Suppression (DNS) grand challenge is the 5th edition of Microsoft DNS challenges with focus on deep speech…
-
One of the few things the console versions still hold over TR1X, and since we've talked about it a couple of times over the past 2 years but have no issue yet, I thought I'd create one.
Released we…
-
We tried to apply ViSQOL to the audio quality evaluation of a security camera device.
Here is our recording process:
Human voice -> Recorded by high-quality microphone (48kHz, 16bit, mono) -> Resam…
-
**Is your feature request related to a problem? Please describe.**
Yes, the current limitation of the Speech-Translate application is its inability to process OPUS audio files directly. Many online c…
-
**IN ORDER TO ASSIST YOU, PLEASE PROVIDE THE FOLLOWING:**
- Speech SDK log taken from a run that exhibits the reported issue.
See [instructions on how to take logs](https://docs.microsoft.com/azu…
-
# Task Name
Emotional Voice Conversion
## Task Objective
Emotional Voice Conversion is a task that aims to convert the emotional state from one to another while preserving the speech informa…