Open Abdullah076 opened 7 years ago
Maybe we can have an import or add Qari feature/API so reciters can format their content on a site and provide us with a link to the XML file that has all the meta data needed to import / download his files
thats a good idea i think
we are working on some scripts to try to generate timing files automatically for gapless qaris - my hope is to get rid of the gapped qaris in the future insha'Allah and use gapless instead.
Can you explain or link to the scripts? Maybe I have a simpler solution.
@Abdullah076: i can probably make an exception for sheikh Ayman and add him even before we figure out how to make everything gapless, just because masha'Allah he's well known for teaching people Quran pronunciation, etc. the rest let's wait.
@nyitgroup
if you think about ayah by ayah, the way that works is people took the full recitation mp3 (something from quranicaudio.com for example) and marked every position of the start/end of every ayah (there is some software that helps for this, but for Windows mostly). then, the files are split according to that. alternatively, they are left as is and can be played as gapless.
this takes a lot of time and effort to do for a person, and people always want other qaris. my experiments were around using machine learning to try to make this happen automatically using the open source cmusphinx libraries. making some progress, but the accuracy is still too low unfortunately.
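As a rough illustration of the marking step described above, here is a minimal sketch of how a list of ayah start marks could be turned into per-ayah segments. The function name and the sample numbers are made up for illustration; cutting the audio itself would need an audio library and is left out.

```python
from typing import List, Tuple


def ayah_segments(start_marks_ms: List[int], surah_end_ms: int) -> List[Tuple[int, int, int]]:
    """Turn per-ayah start marks into (ayah_number, start_ms, end_ms) segments."""
    segments = []
    for index, start in enumerate(start_marks_ms):
        # An ayah ends where the next one starts; the last one ends with the surah.
        if index + 1 < len(start_marks_ms):
            end = start_marks_ms[index + 1]
        else:
            end = surah_end_ms
        segments.append((index + 1, start, end))
    return segments


if __name__ == "__main__":
    # Made-up marks for a three-ayah recording that is 15 seconds long.
    for ayah, start, end in ayah_segments([0, 4200, 9800], 15000):
        print(ayah, start, end)
```

The same segments can either drive a splitter (gapped files) or be stored as is and used for seeking inside the full recording (gapless).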
Any update about Ayman Suwaid's recitation brother @ahmedre ? JazakAllahu khairan
AlSalam alaykum all,
Thank you for this wonderful application!
I have two reciters that I suggest to add to this application:
Upon reading the code and as stated above, timing data of each verse is needed.
I can contribute to it. As this can take much time, I would ensure that:
Programmers, any feedback or notes that can put me on track is highly appreciated.
jazakom Allah khairan
Assalamu 'alaykum,
I know this is an old issue, and can open a new one if appropriate. Worked on generating timing files for gapless playback for Mishary Al Afasy for mp3s on https://quranicaudio.com/. The timing files are in the style of files found on http://www.everyayah.com/data/timings_files/. The files can be found here: https://github.com/anassaiyed/quran_timing_files/tree/timing_zips. I can also create a sqlite database file for this.
Used these scripts for generating the files: https://github.com/anassaiyed/quran_timing_files. The script can also work for generating timing files for other reciters provided they have gapped ayah files and corresponding gapless surah files.
I was wondering if gapless playback for Mishary Rashid Al Afasy can be included in the next release of the app. I can send a pull request with updates to readers.xml if you plan on integrating this in the app.
jazakAllah khairan.
wa3laikum alsalam wa ra7matullahi wa barakatuh, jazakAllah khairan - masha'Allah, this is awesome! several people have been working on trying to write something to detect the ayahs within a gapless sura (but with not-so-ideal accuracy), so really excited to see your script's output for sheikh Mishari - if the accuracy is good, happy to release it in the next version in sha' Allah, and if the script works well for other qaris, it would be amazing for many projects!
may Allah reward you greatly - will try this and let you know in sha' Allah! walsalam 3alaikum.
jazakAllah khairan brother Ahmed for the quick reply. Initially I did not verify the results in detail. Later, I went through the first 10 surahs ayah by ayah and found 15-20 ayahs in Surah Al Baqarah and Surah Al Araf (the largest 2 surahs) that were off by a few seconds. The remaining surahs had 1-2 ayahs that were also off by a few seconds.
I am working on a solution to improve the accuracy. I will in sha' Allah try to generate new timing files and verify their accuracy. Plan to post back here in a couple of days in sha' Allah. Assalamu 'alaykum
jazakAllah khairan, looking forward to it. today, i tried the existing zip file (and wrote a script to convert the text files into Quran for Android's database format). i discovered, though, that on some suras, the timings are not increasing in value for each ayah (see for example 078.txt in the zip file for sheikh Mishari).
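A check like the one below could catch the non-increasing timings mentioned above before a file is converted. It is only a sketch: it assumes the values of one timing file have already been read into a list of integer milliseconds, and the function name is made up.

```python
from typing import List


def non_increasing_lines(timings_ms: List[int]) -> List[int]:
    """Return the 1-based line numbers whose value is not greater than the previous line."""
    bad_lines = []
    for i in range(1, len(timings_ms)):
        if timings_ms[i] <= timings_ms[i - 1]:
            # Line numbers are 1-based, list indexes are 0-based.
            bad_lines.append(i + 1)
    return bad_lines
```

An empty result means every ayah starts later than the one before it; anything else points at the lines worth listening to.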
I updated the timing files after modifying script. The main trick was to use only the first 5 seconds of the ayah mp3 to identify the ayah's location in the surah. The other thing that I did was to only search a 800ms window around the detected location for the quietest portion.
I listened to the surahs and there are only a few errors (I think 1 error for every 2-3 longer surah and almost no errors in the shorter surahs). For the errors, the ayah timings are off by a few hundred milliseconds to about 2-3 seconds (I think I only found 2-3 and I think I corrected most of these). I will update and clean up the code and update the repository. Will post here when that is done.
For surahs 078, 081 and 082, the ayah recordings are different from the surah recordings. Hence the scripts were not useful for them. I manually created the timing files for these.
Anyway, here is the link to the updated zip. Let me know once you try it. https://github.com/anassaiyed/quran_timing_files/blob/timing_zips/mishary_al_afasy_timing_files.zip
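The "quietest portion within an 800ms window" step from this update might look roughly like the sketch below. This is not the code from the repository. It assumes a hypothetical list of loudness values, one per 10 ms frame, and it assumes the window is centred on the detected location (400 ms on each side).

```python
from typing import Sequence


def quietest_near(loudness: Sequence[float], detected_ms: int,
                  window_ms: int = 800, frame_ms: int = 10) -> int:
    """Return the time (ms) of the quietest frame in a window centred on detected_ms."""
    if not loudness:
        raise ValueError("loudness must not be empty")
    half = window_ms // 2
    last_index = len(loudness) - 1
    # Clamp both ends of the window to the frames that actually exist.
    first = min(max(0, (detected_ms - half) // frame_ms), last_index)
    last = min(max(0, (detected_ms + half) // frame_ms), last_index)
    # On ties, the earliest of the quietest frames wins.
    best = min(range(first, last + 1), key=lambda i: loudness[i])
    return best * frame_ms
```

Keeping the window small is what stops the search from jumping to a pause in the middle of a neighbouring ayah.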
Any updates brother @ahmedre?
masha'Allah brother, this is excellent - i've tried on suras 78 after your fixes and sura 79 along with some other suras and it seems spot on. will go ahead and make sure sheikh Mishari is gapless in the next release.
if i may ask:
jazakAllah khairan, really appreciate this effort, may Allah reward you greatly!
also, just to be sure i did this right (i did test so it seems right) - but i made the first line in your file correspond to ayah 1, the second line to ayah 2, and so on, while dropping the last line.
jazakAllah khairan, I am looking forward to the next release for the gapless playback.
I updated the scripts: https://github.com/anassaiyed/quran_timing_files
2. do the set of ayahs passed in to the script have to be for the same sheikh or can i pass in ayahs from one sheikh to try to get gapless data for a different sheikh?
- No, they have to be from the same sheikh (the scripts are not smart :) ). So the fingerprinting has to be done for each sheikh you want to generate the timing files for. Then, the timing files can be generated for the shuyookh whose files have been fingerprinted.
Regarding the lines in the timing files, you are correct that the first line corresponds to ayah 1, second line to ayah 2 and so on. The last line corresponds to the end of the surah (999 in the sqlite database I think). There is a newline character at the end of the timing files, so the empty line it produces can be dropped.
Let me know if you want help generating timing files for other shuyookh.
Also, for generating timing files, the gapless surah files and the ayah files have to be from the same recording. If the recordings are different, the scripts will not generate correct timing data.
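To make the line layout described above concrete, here is a small sketch that turns the text of one timing file into (sura, ayah, time) rows. It assumes each non-empty line holds a single integer millisecond value, and it uses 999 for the end-of-surah row only because that is the value guessed at in this thread. The function name and row shape are made up and are not the app's actual schema.

```python
from typing import List, Tuple


def parse_timing_text(sura: int, text: str, end_marker: int = 999) -> List[Tuple[int, int, int]]:
    """Map line 1 to ayah 1, line 2 to ayah 2, ..., and the last line to end_marker."""
    # Skipping blank lines also drops the empty line left by a trailing newline.
    values = [int(line) for line in text.splitlines() if line.strip()]
    if not values:
        return []
    rows = [(sura, ayah, ms) for ayah, ms in enumerate(values[:-1], start=1)]
    # The final value marks the end of the surah rather than the start of an ayah.
    rows.append((sura, end_marker, values[-1]))
    return rows
```

Rows in this shape are easy to write into whatever table the app expects once its real column names are known.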
jazakAllah khairan - awesome! i think some of the other gapped recordings (ex sheikh Maher) have similarities to the gapless ones, so can try this on them. if you have suggestions as to which to try, i'd be happy to try running the script also. really appreciate this, may Allah reward you greatly for it!
Sheikh Maher's recordings sound similar but aren't the same, so the scripts didn't work. I tested a couple of surahs on some other shuyookh and I found matching recordings for them which worked with the scripts. You can try running the scripts for these.
Ayahs: http://www.everyayah.com/data/Hudhaify_128kbps/ Surahs: https://quranicaudio.com/quran/8
Ayahs: http://www.everyayah.com/data/Hani_Rifai_192kbps/ Surahs: https://quranicaudio.com/quran/27
Ayahs: http://www.everyayah.com/data/Husary_128kbps_Mujawwad/ Surahs: http://www.assabile.com/mahmoud-khalil-al-hussary-27/collection/al-mushaf-al-mujawwad-244
Sheikh Hussary's recordings are very long. So I would recommend running the scripts for the other 2 shuyookh first and see how it works.
jazakAllah khairan! will try this and report back later this week in sha' Allah.
salam 3alaikum, just wanted to give an update. i ran the script on sheikh Hudhayfi and sheikh Hani Rifai. I also discovered that sheikh Rifai already had some timing files (marked as beta) and so did sheikh Hudhayfi.
so both went pretty well overall masha'Allah.
sheikh Rifai had a total of 9 errors, and i pulled the values for those from the beta timings. i tested all 9 of these and they were fine, and did random testing in other places and everything seemed good. i did run across one ayah (not one of the errors) where the timing was ~1 second off in one of the suras. i did a bunch of other random checks in random suras and everything seemed okay.
for sheikh Hudhayfi, i had a lot more errors - 153 - mostly concentrated in a handful of suras in the 29th and 30th juz's. i tried replacing these with the timing files for them, but the timings for these weren't that good (usually several seconds off - perhaps it was from a different source for those suras, they don't specify the source for them unfortunately) - anyway the ones that weren't errors were generally good - i did some random testing and everything seemed okay. in sha' Allah i'll re-do most of the suras with errors (only a handful of them and they're short) using your manual script.
do you have any easy way to figure out what is a good candidate for being off so we can just check those?
jazakAllah khairan. walsalam 3alaikum.
cc @anassaiyed oh, one more thing - some of the suras that had errors for sheikh Hudhayfi gave me a total of 3 lines in the file (ex sura Qiyyamah, which has 40 ayahs, came back with 3 lines total in the text file). i only saw this for sheikh Hudhayfi.
will in sha' Allah try Hussary mujawwad.
wa3laikum alsalam, jazakAllah khairan! May Allah reward you for all this work. Unfortunately, I do not have any way to accurately detect if ayahs are off by a few seconds other than manually listening. There is a play.py (needs vlc and vlc python bindings installed) file in the manual folder in the repository that can help with that, but still takes time.
For the files that have only a few lines instead of 40, I suspect, that the ayah files and the surah files are from different recordings. So dejavu might have returned a null (need to add error handling for this scenario) and the sub process probably terminated at that point. This was the case for surahs 078, 081 and 082 for sheikh Mishary. If you can find the correct matching recordings for those ayah files, then the scripts would work. Otherwise timing files would have to be generated manually for them.
I will try to find matching recordings for the other gapped shuyookh in the app. The ayah mp3s must have been split from a whole recording originally, but I was not able to find the matching recordings for sheikh Maher when I originally looked.
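Until the error handling is in place, truncated files like the 3-line one mentioned above can at least be spotted automatically. The sketch below assumes a complete file has one line per ayah plus one end-of-surah line; both dictionaries are hypothetical inputs that the caller would have to build.

```python
from typing import Dict, List, Tuple


def incomplete_files(line_counts: Dict[int, int],
                     ayah_counts: Dict[int, int]) -> List[Tuple[int, int, int]]:
    """Return (sura, actual_lines, expected_lines) for files that came out too short."""
    short = []
    for sura in sorted(line_counts):
        if sura not in ayah_counts:
            # No reference count supplied for this sura, so it cannot be judged.
            continue
        expected = ayah_counts[sura] + 1  # one line per ayah plus the end-of-surah line
        if line_counts[sura] < expected:
            short.append((sura, line_counts[sura], expected))
    return short
```

For the example from this thread, a 3-line file for a 40-ayah sura would be reported as 3 lines found against 41 expected.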
Here are a few other shuyookh:
http://www.everyayah.com/data/Muhammad_Ayyoub_128kbps/ https://archive.org/details/Muhammad-Ayyub/110.mp3
http://www.everyayah.com/data/Muhammad_Jibreel_128kbps/ https://quranicaudio.com/quran/12
http://www.everyayah.com/data/Abdullah_Basfar_192kbps/ https://archive.org/details/AbdullahAliBasfar/001.mp3
http://www.everyayah.com/data/Minshawy_Mujawwad_192kbps/ https://quranicaudio.com/quran/41
http://www.everyayah.com/data/English/Sahih_Intnl_Ibrahim_Walk_192kbps/ http://audio-quran-eng.blogspot.com/2014/04/the-holy-quran-read-by-ibrahim-walk.html
https://download.quranicaudio.com/quran/mishaari_w_ibrahim_walk_si/ http://www.everyayah.com/data/Alafasy_128kbps/
I was not able to find matching files for sheikh Maher and Dr Ayman Suwaid. I think I listed matching files for all the other gapped qaris in the app.
jazakAllah khairan - will continue trying these. unfortunately, sheikh Hussary Mujawwad didn't work well (1578 lines written in the timing files total). i'll try some of the other shuyookh you mentioned above and see if those give good timings in sha' Allah.
but at least so far this gives us 3 successful things - sheikh Mishari that you did (jazakAllah khairan), sheikh Rifai, and sheikh Hudhayfi (needs some minor tweaking). i am optimistic that some of the others you mentioned should work as well in sha' Allah.
Added 2 new timing zips here.
The timings for mishari_al_afasy_with_ibrahim_walk_english.zip were generated for these mp3 files (The link will work only for a few days). I created these files by simply joining the ayah files found on everyayah. Not sure if it is an upgrade over the current gapless audio, but it is there if needed.
Also, I tried generating timing files for sheikh Abdullah Basfar but the results were not always very good. Just letting you know so that you don't have to waste time.
jazakAllah khairan - awesome, barak Allah feek!!
for the files you generated, if they eliminate the pauses and sound smooth, then it's an upgrade. i will download and try it in sha' Allah.
for sheikh Basfar, i had run the script but didn't test it yet (against both the archive version and the quranicaudio version) - the missing entries were few, but did not test the accuracy. i can test it against the quranicaudio version since i already generated the timing files.
i have also run it for sheikh Muhammad Jibreel (but haven't validated the results yet). for sheikh Muhammad Ayoob, i plan on testing the timing files on everyayah first before running the script.
For sheikh Basfar, the mp3 files at https://archive.org/details/AbdullahAliBasfar/001.mp3 have some problem with the length in the encoding. That's why the timings did not seem right. Re-encoding them fixed the issue. The timings you generated for sheikh Basfar should work with these mp3 files (the link will only work for a few days). If your timing files don't work, you can use the timings I uploaded here.
Also, any update on the next release of the android app? jazakAllah khairan.
sorry for the late reply and jazakAllah khairan - i am traveling at the moment (but had my friend download the sheikh Basfar files for me so i can get them when i am back in sha’ Allah). are these also generated from the verse by verse files?
let’s aim for a release within the next 2 weeks in sha’ Allah.
No problem. The files are generated from https://archive.org/details/AbdullahAliBasfar/001.mp3. These are whole surah files, but due to some issue in their encoding, the length information in the mp3 was not correct. So I just re-encoded them. That sounds great. jazakAllah khairan.
sorry for being a bit slow about this. for sheikh Muhammad Jibreel, we're not getting all the lines unfortunately - for example:
~/Desktop/audio/mjibreel/timingFiles
❯ wc -l *
9 001.txt
97 002.txt
202 003.txt
24 004.txt
122 005.txt
141 006.txt
88 007.txt
77 008.txt
29 009.txt
111 010.txt
125 011.txt
53 012.txt
though on the bright side, the heuristic error rate seems low (6 sura files with 1 error each and a 7th with 15).
for sheikh Basfar - the QuranicAudio files didn't work very well (timings were way off). the files from the link you sent (archive.org) work a lot better and are "ok," but aren't very good (in many cases, we're off 1-2 seconds - i only tested some of sura Mursalat and sura Naba') - usually an ayah starts 1-2 seconds too early. i haven't tried with your zip file mp3s yet, but in sha' Allah will once my friend is back and i can get the files from him next week.
Really appreciate the effort you put into the app. May Allah reward you for this. For sheikh basfar, the zip files I sent should work well (I corrected the mp3s from the link by re-encoding).
I will look into the issues for sheikh Jibreel insha'Allah.
I was able to generate timings for most of the surahs. There were a few errors but I corrected them manually. But for surah 4 and 11, there are problems in the surah mp3 files https://quranicaudio.com/quran/12.
Surah 4 Ayah 59 ends at 27:43. Then, ayah 88 starts at 27:46. Other ayahs in the middle are missing.
Surah 11 Ayah 83 ends at 24:51. Ayah 84 starts at 35:17. Something else plays in between.
I can manually fix the mp3 for surah 11. But not sure what to do for the missing part in surah 4. Will try to find the complete file online.
should i just have re-run them you think? for sura 4, mp3quran.net fills in sheikh Maher Mu'aiqly for the verses that are not there on their version (http://mp3quran.net/eng/jbrl_english.html).
the file at al sabeel seems okay though - http://www.assabile.com/muhammad-jibreel-59/collection/al-mushaf-al-murattal-53 (but it might be a different recitation).
also, may Allah reward you greatly for driving this and making it happen - may Allah give you the best in this world and the next!
No, re-running would not have helped. I made a change to the code to handle a 'None' exception when a location is not found in the surah (so that we can get all the lines). I will push the error handling code to the repository. I will try the mp3quran recitation to see if it works. I already tried the one on al sabeel, but it was a different recitation and did not work.
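The actual change is in the repository and is not shown in this thread, but the idea of "keep going when a location is not found" can be sketched as below. Here `locate` is a hypothetical stand-in for whatever lookup the scripts perform, and -1 is an assumed placeholder value, not necessarily what the real scripts write.

```python
from typing import Callable, Iterable, List, Optional


def collect_timings(ayah_ids: Iterable[int],
                    locate: Callable[[int], Optional[int]],
                    missing: int = -1) -> List[int]:
    """Call locate(ayah_id) for every ayah and always keep one output value per ayah."""
    lines = []
    for ayah_id in ayah_ids:
        position = locate(ayah_id)
        # A failed match becomes a placeholder instead of ending the run early.
        lines.append(missing if position is None else position)
    return lines
```

With a placeholder in place, the file keeps the right number of lines and the gaps are easy to find and fill in manually.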
as an update, just tested the timing files for sheikh Muhammad Ayoob (tested parts of sura Baqarah, sura A'raf, and a few suras from the 30th juz'). sounded pretty spot on. the only exception is the last 18 ayahs of sura A'raf are missing (189-206).
but i think we're good to go with sheikh Muhammad Ayoob.
That's great! I added the timings for sheikh Muhammad Jibreel here. Manually corrected the errors. Also corrected the mp3s for surah 4 and 11. Here are the corrected files. For surah 4, I just removed the extra part. For surah 11, I got the missing verses from here and added them to the mp3 from quranicaudio. Everyayah uses this source for the missing verses.
jazakAllah khairan @anassaiyed - i replaced the 2 mp3s on the server with the ones you corrected (and moved the old ones into a "mistakes" directory).
i tested out the timings for sheikh Jibreel, and while they're decent, they're still a bit off in some cases (not noticeable if you're say listening and reading, but would be pretty notable in case of doing something like the repeat function). for example, try out sura Mursalat, you'll find several ayahs where the first letter is cut off.
do you think we can "algorithmically" fix these by detecting the silences and doing something like "if time detected by script is within x milliseconds of a silence, snap to silence instead"?
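A minimal sketch of that snapping rule, assuming the silences have already been detected by some other means and are given as a list of positions in milliseconds (the function name and parameters are made up for illustration):

```python
from typing import Sequence


def snap_to_silence(detected_ms: int, silences_ms: Sequence[int], max_distance_ms: int) -> int:
    """Return the nearest silence if it is within max_distance_ms, else the detected time."""
    if not silences_ms:
        return detected_ms
    nearest = min(silences_ms, key=lambda s: abs(s - detected_ms))
    if abs(nearest - detected_ms) <= max_distance_ms:
        return nearest
    # No silence close enough, so keep the script's own estimate.
    return detected_ms
```

The threshold matters: too small and nothing moves, too large and a timing could jump to a pause that belongs to a different ayah.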
also, somewhat related (?) - but i modified sheikh Muhammad Ayoob's sura A'raf to add the missing 18 ayahs and added timings for them. what i am seeing though is on Android, the timings are all off (despite reading correctly from the database, etc). is there anything i need to do to "fix" the mp3 so the seek times work properly? to clarify, the timings at the start of the file for example are off, despite the fact that i didn't change them (and only modified audio at the end of the mp3 without touching the start of it).
the updated file is here: https://download.quranicaudio.com/quran/muhammad_ayyoob/007.mp3
and the original file that is missing the 18 ayahs is here: https://download.quranicaudio.com/quran/muhammad_ayyoob/mistakes/007.mp3
i tried mp3val and it helped some, but things are still off. using Audacity, i can validate the timestamps are actually correct (ex 37153 for ayah 3). any idea what i might be missing here? jazakAllah khairan.
I checked the timings for surah Mursalat for sheikh Jibreel on my system (I use vlc's python bindings for that). I did not notice the letters being cut off (can you validate these timings too with audacity?). But I think sometimes different players can work differently. I re-encoded surah Mursalat. If that corrected the timings, I can re-encode and upload the other surahs.
For sheikh Muhammad Ayyoob's surah A'raf too, I re-encoded the file and the timings sound much better now. Let me know if that fixed it.
https://www.swisstransfer.com/d/4d254c0b-2c7c-491a-b39b-048ba68fceb1
really sorry for the late reply. i just tested these. sura A'raf for sheikh Ayoob works flawlessly now, jazakAllah khairan - how did you re-encode it so i can try this if i run into issues with other files in the future?
for sheikh Jibreel's sura Mursalat, i am still having the same issue. i tested with Audacity and the timings are spot on and correct. i can take another look at the gapless code to make sure that we're not overplaying in certain cases and going to the next file.
you can see this happen by playing the sura and setting each ayah to repeat once - most ayahs are fine, but some ayahs (ex 11, 12) seem to be cut off (though the timing is spot on in Audacity). could this also be related to encoding?
jazakAllah khairan!
just tried the sheikh Basfar files you had uploaded (and the timing files i generated). i am seeing something similar to sheikh Jibreel - where the times are correct in Audacity but playing them (especially hitting back to re-play an ayah or doing the "repeat each ayah once" option) has the same issue. it's particularly noticeable for example in sura Mursalat ayah 12, where the repeat cuts off the first 2 letters (though again, the timing is correct).
jazakAllah khairan.
2 more things:
Ibrahim Walk (alone) - i am downloading the files from the link you have sent (a bunch of MediaFire links) - but just want to make sure, your timing files ran against those mp3s specifically, right? (asking not just to make sure of alignment, but also as a "poor man's sanity check" to make sure that mp3s on a random website are what we think they are and have not been modified, etc) -- (update, just realized there's a "download all" at the bottom, only saw it after downloading 70ish of the files by hand 🤦‍♂️)
for the sheikh Mishari with Ibrahim Walk - we didn't have this in the app before, but lots of people were asking for something like this. Sadly, i missed the link and didn't download this one. Can I just use the quranicaudio one?
jazakAllah khairan.
I'm sure all the above issues are due to encoding (looks good in audacity but timings are off by a bit when playing). Until now, I have used 2 methods to reencode the files.
The script basically uses the code below:
from pydub import AudioSegment

# Decode the whole mp3, then re-encode it into a new file
surah = AudioSegment.from_mp3('input.mp3')
surah.export('converted.mp3', format="mp3")
But seems like it doesn't work every time. I don't have the quran_android project set up on my computer so can't currently test the new mp3s in the app unfortunately. I test the mp3s using vlc but it seems that sometimes the timings work in vlc and not in the app.
For the sheikh Mishari with Ibrahim Walk you can use the quranicaudio link. That's what I used to generate the timings.
For Ibrahim Walk alone, sorry you had to download the individual mediafire files 😃. The mediafire links did not work for all the surahs from what I remember, so I created the surah files by joining the ayah files. I had uploaded the mp3s but the link expired. I uploaded them again (link below). The timings should work for these files.
For sheikh Jibreel's sura Mursalat, I encoded the file again, this time using CBR (constant bit rate) with freeac. The timings are correct with vlc, but not sure how they will work in the app. Link below.
If this solves the issue, the other files can be converted to use CBR too. If not, I will do a deeper dive into this issue.
https://www.swisstransfer.com/d/044ffd2e-a2af-4378-9049-de7efa05d96c
salam 3alaikum, sorry for the late reply. sadly, even with the newest file, i can still repro the same issue on an emulator running Android 28. I also tried on my phone (Pixel 2, Android 29), but ran into the same issue in the same places.
I have uploaded an apk here that has sheikh Jibreel as an option with gapless. all you'd need to do is make a directory (/sdcard/Android/data/com.quran.labs.androidquran.debug/files/quran_android/audio/mjibreel if you didn't give permissions or /sdcard/quran_android/audio/mjibreel if you did), copy the database there (it's in the same link as above) and you should be good to go. files will download from https://download.quranicaudio.com/quran/muhammad_jibreel/complete/, but you can replace files there to see their effect.
jazakAllah khairan for your help with this! i'll try sheikh Walk (alone and with sheikh Mishari) soon in sha' Allah.
salam 3alaikum, sorry for the long time no update. just tested sheikh Mishari with Ibrahim Walk and it works great from the handful of suras I tested. that's 4 shuyookh (Mishari, Hani Rifai, Muhammad Ayoob, and Mishari with Ibrahim Walk) ready for shipping.
in sha' Allah will test out the Ibrahim Walk gapless ones and fix the handful of Hussary files that need manual timing files. that then just leaves re-encoding sheikhs Basfar and Muhammed Jibreel.
jazakAllah khairan, walsalam 3alaikum.
Assalamu 'alaykum. I have some requests.
Can you add recitation of ayman suwaid in your Quran app? some tajweed learners recommend him for his tajweed. Info about him: http://www.assabile.com/ayman-swed-345/ayman-swed.htm His recitation in verse by verse format (gapped) is available here in zip form: http://quran.ksu.edu.sa/ayat/?pg=patches&l=en you can extract them and then add them in your Quran app & website in sha Allah. jazakAllahu khairan.
Can you add also recitations of muhammad luhaidan & abdulrahman jamal aloosi? info about them: http://www.assabile.com/muhammad-al-luhaidan-95/muhammad-al-luhaidan.htm http://www.assabile.com/abdulrahman-jamal-aloosi-360/abdulrahman-jamal-aloosi.htm regarding them, i still dont know of any gapped playback.