Ability to output time tags for each word when assessing audio of multiple languages + the Quran.

Salam, I ignored the preset message about asking questions on discourse.mozilla.org because, to my understanding, that forum is not specific to this project but rather to the repo from which this project was forked.

I'm very new to speech recognition and have not yet learned any ML (something I do plan to learn in the future), so my background doesn't really align with this project.

I'm looking for a tool that can timestamp a video based on the recitation of verses from the Quran. However, the videos I'm dealing with could contain two languages (English + Arabic). The idea would be to run a speech recognition tool that produces a timestamp whenever the speaker recites a verse of the Quran. I noticed that one of the readme files mentions the ability to "also output time tags for each word". Would it be possible to timestamp a ~2hr video in which verses of the Quran make up only a small portion of the audio, or does the audio have to be strictly Quran?

tarekeldeeb / DeepSpeech-Quran