OmarKhled / Tafakor

Programmatically Creating Quran Clips
MIT License

Annotated reciters #1

Open OmarKhled opened 9 months ago

OmarKhled commented 9 months ago

The main limitation of the Tafakor Generator is the lack of annotated reciters. To step through the captions in a video in sync with the recitation, the audio files must be annotated: each mp3 file is paired with data indicating when each individual word begins and ends.
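To make "annotation" concrete, here is a minimal sketch of per-word timing data and a lookup that returns the word to highlight at a given playback time. The `WordTiming` type, its field names, and the `active_word` helper are hypothetical illustrations, not the quran.com API schema or the Tafakor code, and the timings are made-up placeholders.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WordTiming:
    """One annotated word: its text and when it starts/ends in the audio (hypothetical schema)."""
    word: str
    start_ms: int
    end_ms: int


def active_word(timings: List[WordTiming], t_ms: int) -> Optional[WordTiming]:
    """Return the word being recited at t_ms, or None during silence.

    Assumes `timings` is sorted by start_ms and words do not overlap.
    """
    starts = [w.start_ms for w in timings]
    i = bisect_right(starts, t_ms) - 1  # last word that starts at or before t_ms
    if i >= 0 and t_ms < timings[i].end_ms:
        return timings[i]
    return None


# Placeholder data: three words with a short pause before the third.
timings = [
    WordTiming("word1", 0, 480),
    WordTiming("word2", 480, 1100),
    WordTiming("word3", 1250, 1900),
]

current = active_word(timings, 600)   # falls inside word2
pause = active_word(timings, 1200)    # falls in the gap between word2 and word3
```

A caption renderer could call such a lookup once per video frame; without the start and end times there is nothing to drive the highlighting, which is why unannotated reciters cannot be used.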

Currently, these annotated audio files are sourced from the quran.com API. However, its list of annotated reciters is short. To address this, we need to expand the list of reciters with annotated audio files.

This expansion can be achieved with ASR (Automatic Speech Recognition) models, which can annotate audio files from additional reciters automatically.
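One possible way to turn ASR output into annotations, sketched under assumptions: since the verse text is already known, the words recognized by the model (assumed here to come with start and end times) can be aligned against the reference text, and the timestamps copied over wherever the words match. The `align_to_text` helper and the sample data are hypothetical; the alignment uses Python's standard `difflib`, not any particular ASR toolkit.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

AsrWord = Tuple[str, int, int]                           # (word, start_ms, end_ms)
AlignedWord = Tuple[str, Optional[int], Optional[int]]   # timings are None if unmatched


def align_to_text(reference_words: List[str], asr_words: List[AsrWord]) -> List[AlignedWord]:
    """Copy ASR timestamps onto the known text wherever the recognized word matches."""
    recognized = [w for w, _, _ in asr_words]
    aligned: List[AlignedWord] = [(w, None, None) for w in reference_words]
    matcher = SequenceMatcher(a=reference_words, b=recognized, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            _, start_ms, end_ms = asr_words[block.b + k]
            aligned[block.a + k] = (reference_words[block.a + k], start_ms, end_ms)
    return aligned


# Placeholder data: the second word was misrecognized by the (imaginary) model.
reference = ["w1", "w2", "w3", "w4"]
asr_output = [("w1", 0, 400), ("wX", 400, 900), ("w3", 900, 1500), ("w4", 1500, 2000)]
annotated = align_to_text(reference, asr_output)
```

Words left with `None` timings are the ones the model missed or got wrong; they would need interpolation from their neighbours or manual correction.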

Previous work: here are some sources of related work in this area:

For reference, there is also a Whisper ASR notebook:

OmarKhled commented 9 months ago

https://github.com/tarekeldeeb/DeepSpeech-Quran

OmarKhled commented 7 months ago

I decided to proceed with the DeepSpeech-Quran repo and build on the Mozilla model. The current trained model published by Mr. Tarek Eldeeb has fair accuracy: it detects most words, but around 25% of the words are either not detected or misdetected.
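A figure like this can be tracked while retraining by comparing the model's transcript with the known verse text word by word. The following is a rough sketch, not how the 25% above was measured: `detection_report` is a hypothetical helper, it uses `difflib` rather than a proper word-error-rate tool, and the sample words are placeholders.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def detection_report(reference_words: List[str], recognized_words: List[str]) -> Dict[str, float]:
    """Count reference words that were recognized, missed, or misdetected."""
    matcher = SequenceMatcher(a=reference_words, b=recognized_words, autojunk=False)
    correct = missed = misdetected = 0
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            correct += i2 - i1        # recognized as written
        elif tag == "delete":
            missed += i2 - i1         # present in the text, absent from the transcript
        elif tag == "replace":
            misdetected += i2 - i1    # recognized as some other word(s)
        # "insert" (extra words in the transcript) is ignored in this sketch
    total = len(reference_words)
    error_rate = (missed + misdetected) / total if total else 0.0
    return {"correct": correct, "missed": missed,
            "misdetected": misdetected, "error_rate": error_rate}


# Placeholder example: one of four words is dropped by the model.
report = detection_report(["a", "b", "c", "d"], ["a", "c", "d"])
```

Separating missed words from misdetected ones helps decide what to fix: missed words suggest segmentation or silence handling problems, while misdetected words point at the acoustic or language model.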

Insha'Allah, I intend to train the model on Islam Sobhy's recitation data.