-
Loading data from berlin dataset...
Error in readAudioFile(): Unknown file type!
Traceback (most recent call last):
File "emorecognition.py", line 41, in <module>
db = Dataset(path,db_type,decode=…
-
For support and discussions, please use our [Discourse forums](https://discourse.mozilla.org/c/deep-speech).
If you've found a bug, or have a feature request, then please create an issue with the f…
-
I got a new GCP API key and tried to use it, but I keep getting a broken connection error:
`Traceback (most recent call last):
File "main.py", line 13, in <module>
speech=r.recognize_google(audio,key='…
-
type IsSpeaking
bool
type WhoIsSpeaking
uuid
known speakers
[chat on diarization embeddings](https://chatgpt.com/share/6704175b-9184-800f-bc01-2076a8af85bf)
[chat on running models locall…
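
The two types noted above could be sketched as plain Python dataclasses. This is only one possible reading of the notes: the names `SpeakerState`, `KnownSpeakers`, and `register` are hypothetical, and storing known speakers as a UUID-to-name mapping is an assumption, not something the notes specify.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from uuid import UUID, uuid4


@dataclass
class SpeakerState:
    """Hypothetical container for the two values in the notes."""
    is_speaking: bool = False                # "type IsSpeaking: bool"
    who_is_speaking: Optional[UUID] = None   # "type WhoIsSpeaking: uuid"


@dataclass
class KnownSpeakers:
    """Assumed registry mapping a speaker UUID to a display name."""
    names: Dict[UUID, str] = field(default_factory=dict)

    def register(self, name: str) -> UUID:
        # Assign a fresh random UUID to each newly enrolled speaker.
        speaker_id = uuid4()
        self.names[speaker_id] = name
        return speaker_id
```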
-
How can word-level, rather than phoneme-level, speech recognition be done with the TIMIT dataset?
I can build and train models, but I only have phoneme transcriptions. I want word transcriptio…
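
One possible starting point, assuming the standard TIMIT distribution layout: each utterance has a `.WRD` file next to its `.PHN` file, with one `start_sample end_sample word` entry per line. The sketch below parses that text into word segments. The names `WordSegment`, `parse_wrd`, and `transcript` are hypothetical helpers, not part of any library.

```python
from typing import List, NamedTuple


class WordSegment(NamedTuple):
    start_sample: int
    end_sample: int
    word: str


def parse_wrd(text: str) -> List[WordSegment]:
    """Parse the contents of a TIMIT-style .WRD file into word segments."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end = int(parts[0]), int(parts[1])
        segments.append(WordSegment(start, end, " ".join(parts[2:])))
    return segments


def transcript(segments: List[WordSegment]) -> str:
    """Join the word labels into a word-level target string."""
    return " ".join(segment.word for segment in segments)
```

The resulting string could then serve as the training target in place of the phoneme sequence, for example with a character-level or word-level output vocabulary.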
-
I installed the repo without the CLI on virtualized Vast.ai instances with A100 40 GB and 80 GB GPUs.
`is_flash_attn_2_available()` is `False`. Does that mean flash-attn is not used during inference? Does it advers…
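
If the repo follows the common Hugging Face pattern (an assumption, since its loading code is not shown here), the attention backend is chosen through the `attn_implementation` argument of `from_pretrained`. The sketch below shows one way to pick a value with a fallback. `pick_attn_implementation` is a hypothetical helper, and the `find_spec` check is only a rough stand-in for `is_flash_attn_2_available()`, which also checks for CUDA and a compatible version.

```python
import importlib.util
from typing import Optional


def pick_attn_implementation(flash_available: Optional[bool] = None) -> str:
    """Return an attn_implementation value, falling back to SDPA.

    If flash_available is None, approximate the check by testing whether
    the flash_attn package can be found (assumption: this is weaker than
    the check transformers performs).
    """
    if flash_available is None:
        flash_available = importlib.util.find_spec("flash_attn") is not None
    return "flash_attention_2" if flash_available else "sdpa"
```

The returned string would be passed as `attn_implementation=...` when loading the model. With the `"sdpa"` fallback, inference still runs, only without the flash-attn kernels.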
-
```
Obviously this is a non-trivial request/suggestion.
Something like the Google Mobile Voice recognition.
Workflow:
Tap home/Lift phone to ear
*audio prompt*
Say the name of the application
QG la…
```
-
Different recording hardware seems to make a big difference in the overall accuracy.
On an AMD Ryzen 5 with a Logitech C925e as the input device at ~75% loudness level, the accuracy of the word "carola…
-
#### Description
[ESP-EYE](https://www.espressif.com/en/products/devkits/esp-eye/overview) is a development board for image recognition and audio processing, which can be used in various AIoT appli…
-
# Task Name
Musical Style Transformation
## Task Objective
The goal of this task is to transform a given music piece from one musical genre to another, preserving the original melody and lyri…