-
Train a model to recognise Morse code sequences from audio inputs for improved communication in emergencies. Collect diverse Morse code audio datasets, preprocess data, select appropriate architecture…
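A minimal sketch of what such a pipeline could look like, assuming PyTorch/torchaudio, log-mel features, and a per-frame symbol inventory (dot, dash, gap, blank); the `MorseRecognizer` module and every shape below are illustrative assumptions, not a prescribed architecture:
```python
import torch
import torch.nn as nn
import torchaudio

class MorseRecognizer(nn.Module):
    """Tiny CNN + bidirectional GRU over log-mel frames; emits per-frame
    symbol logits (dot / dash / gap / blank) that a CTC loss could train."""
    def __init__(self, n_mels: int = 64, n_symbols: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d((2, 1)),                   # pool over frequency only, keep time
        )
        self.rnn = nn.GRU(16 * (n_mels // 2), 64, batch_first=True, bidirectional=True)
        self.out = nn.Linear(128, n_symbols)

    def forward(self, log_mel: torch.Tensor) -> torch.Tensor:
        # log_mel: (batch, 1, n_mels, time)
        x = self.conv(log_mel)                      # (batch, 16, n_mels/2, time)
        x = x.permute(0, 3, 1, 2).flatten(2)        # (batch, time, features)
        x, _ = self.rnn(x)
        return self.out(x)                          # (batch, time, n_symbols)

# Preprocessing sketch: waveform -> log-mel features -> per-frame logits.
melspec = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=64)
waveform = torch.randn(1, 16000)                    # stand-in for a real Morse clip
features = torch.log(melspec(waveform) + 1e-6).unsqueeze(0)  # (1, 1, 64, time)
print(MorseRecognizer()(features).shape)            # (1, time, 4)
```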
-
Thanks for this solid work. Have you released any preprocessed emotion recognition web datasets such as RAVDESS or CREMA-D, or any data-processing files so we can process the data ourselves? @knoriy @Yuchen…
-
**Summary:**
Currently, the project relies on YouTube’s captioning system for lyrics extraction. However, only a limited number of YouTube videos have captions enabled, restricting the number of song…
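As a rough illustration of that limitation, the sketch below checks whether a video exposes any captions before attempting lyrics extraction. It assumes the third-party `youtube-transcript-api` package and a placeholder video ID, neither of which is part of this project, and the package's API surface differs between versions:
```python
# Sketch only: probe caption availability before relying on it for lyrics.
# Assumes `pip install youtube-transcript-api`; classmethod-style API as in
# pre-1.0 releases of the package (newer releases use an instance API).
from youtube_transcript_api import (
    YouTubeTranscriptApi,
    TranscriptsDisabled,
    NoTranscriptFound,
)

def has_captions(video_id: str) -> bool:
    """Return True if any manual or auto-generated transcript is listed."""
    try:
        YouTubeTranscriptApi.list_transcripts(video_id)
        return True
    except (TranscriptsDisabled, NoTranscriptFound):
        return False

print(has_captions("dQw4w9WgXcQ"))  # placeholder video ID
```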
-
`torchaudio` is an extension library for PyTorch, designed to facilitate audio processing using the same PyTorch paradigms familiar to users of its tensor library. It provides powerful tools for audio…
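For instance, a typical torchaudio workflow, loading a clip, resampling it, and converting it to log-mel features for a PyTorch model, looks roughly like this (the file path and parameter values are placeholders):
```python
import torch
import torchaudio

# Load an audio file as a (channels, samples) tensor plus its sample rate.
waveform, sample_rate = torchaudio.load("clip.wav")  # placeholder path

# Resample to 16 kHz, a common rate for speech models.
resample = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16000)
waveform = resample(waveform)

# Turn the waveform into log-mel features, ready to feed a network.
mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=80)(waveform)
log_mel = torch.log(mel + 1e-6)
print(log_mel.shape)  # (channels, n_mels, time_frames)
```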
-
The rebrowser patches are great, and I am sure there is a big difference now compared with any stock Puppeteer version. What I found is that once we open a browser and type something, it works perfectly, but trying to a…
-
# Task Name
Text-Guided Speech In-Context Learning
## Task Objective
This task aims to utilize textual instructions to guide the interpretation of sequential audio clips, ultimately determini…
-
### Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general).
- [X] I have s…
-
Hi there, wonderful app you've made!
Would you consider adding an option to keep the audio recording for each recognition event so that it can be played back from the logged recognitions list?
It could be …
-
# Task Name
Japanese Pitch Accent Word Recognition
## Task Objective
This task aims to recognize words in Japanese audio that have different meanings based on pitch accent. Japanese pitch accent …
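One concrete way to surface that cue, sketched under the assumption that torchaudio is available: extract the fundamental-frequency (pitch) contour of each word and compare its shape, since minimal pairs such as 箸 'chopsticks' and 橋 'bridge' (both read *hashi*) differ mainly in where the pitch falls. The stand-in tensors below take the place of real recordings; this is not the task's reference implementation:
```python
import torch
import torchaudio.functional as F

SAMPLE_RATE = 16000

def pitch_contour(waveform: torch.Tensor) -> torch.Tensor:
    """Frame-level F0 estimates; the contour's shape (early vs. late fall)
    is the cue that separates pitch-accent minimal pairs."""
    return F.detect_pitch_frequency(waveform, SAMPLE_RATE, freq_low=60, freq_high=400)

# Stand-ins: in practice these would be recordings of the two words.
word_a = torch.randn(1, SAMPLE_RATE)  # e.g. 箸 hashi, pitch falls after the 1st mora
word_b = torch.randn(1, SAMPLE_RATE)  # e.g. 橋 hashi, pitch rises onto the 2nd mora

print(pitch_contour(word_a).shape, pitch_contour(word_b).shape)  # (1, n_frames) each
```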
-
Hi, I am trying out the below code:
```python
import speech_recognition as sr
# Obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
print("Speak now!")
au…