smalissa closed this issue 3 years ago
Long Aya recordings make training very slow, so you should trim off the long files. In the code snippet you mentioned, the cutoff threshold is selected by file size in bytes. To make this more human-readable, a few selected percentages are mapped to their byte values.
I hope that's clear.
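As a rough illustration of trimming by file size, here is a minimal sketch and not the repository's actual code. It assumes a CSV with `wav_filename`, `wav_filesize`, and `transcript` columns. The function name `trim_long_files`, the sample rows, and the `'70p'` byte value are hypothetical; only `'100p': 9999999` comes from this thread.

```python
import csv
import io

# Lookup table: label -> maximum file size in bytes.
# Only the '100p' value appears in this thread; the '70p' value is a made-up placeholder.
SIZE_CUTOFFS = {"100p": 9999999, "70p": 400000}


def trim_long_files(rows, label):
    """Keep only rows whose file size does not exceed the cutoff for `label`."""
    cutoff = SIZE_CUTOFFS[label]
    return [row for row in rows if int(row["wav_filesize"]) <= cutoff]


# Hypothetical sample data standing in for a real training CSV.
sample_csv = """wav_filename,wav_filesize,transcript
a.wav,120000,short
b.wav,390000,medium
c.wav,2500000,long
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
kept = trim_long_files(rows, "70p")
print([row["wav_filename"] for row in kept])
```

With the placeholder cutoff, the largest file is dropped and the two smaller ones are kept; choosing `'100p'` keeps every row.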
Thank you, but it is still not very clear. Could you please explain it in a simpler way? I understand the idea and the goal of the trimming step, but I don't understand what these values mean in detail. For example, in '100p': 9999999, what does the 100 refer to, what does the p symbol mean, and what does the number 9999999 mean? Also, what does "eval_threshold = 0.15" mean?
Also, what about the second question above, about the data mixing?
@tarekeldeeb @aibrahim- @omerasif-itu
Assalam Alaikum, could you please clarify these questions:
In the "Imam + filtered users" dataset that is used: how was the Imam data mixed with the Tarteel data? Was the Tarteel_user train file simply appended to the Imam train file, and likewise for the dev and test files, or is there a specific mechanism?
I understand the idea and the goal of trimming off the long files, but I need to understand clearly what these values mean:
For example, in '100p': 9999999, what does the 100 refer to, what does the p symbol mean, and what does the number 9999999 mean? Also, what does "eval_threshold = 0.15" mean?
Thank you in advance.
100p is shorthand for 100% of the data. Other percentages, such as 70%, are provided as well. The numerical value is the file size in bytes. Remember, the data is lossless, so the longer the recording, the larger the file size.
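To make the percentage-to-bytes mapping concrete, here is one way such a table could be derived. This is a sketch under assumptions, not the method confirmed in this thread: it assumes "70%" means 70% of the files by count, and `size_cutoff` and the sample sizes are hypothetical.

```python
import math


def size_cutoff(file_sizes, percent):
    """Return the smallest byte size such that `percent`% of the files are at or below it."""
    ordered = sorted(file_sizes)
    # Number of files to keep, rounded up, with at least one file kept.
    keep = max(1, math.ceil(len(ordered) * percent / 100))
    return ordered[keep - 1]


# Hypothetical file sizes in bytes, standing in for a real dataset.
sizes = [100_000, 150_000, 200_000, 250_000, 300_000,
         350_000, 400_000, 900_000, 2_000_000, 5_000_000]

# Build a label -> byte cutoff table in the same shape as '100p': 9999999.
cutoffs = {f"{p}p": size_cutoff(sizes, p) for p in (70, 90, 100)}
print(cutoffs)
```

Under this reading, a value like 9999999 for '100p' could simply be a ceiling larger than any file in the dataset, so that nothing is trimmed.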
@tarekeldeeb @aibrahim- @omerasif-itu
Assalam Alaikum, could anyone please help me answer these questions? I understand the work that was done on the Tarteel data and the Imams data, I understand the workflow, and I was able to run the code, but I am still confused about these points:
I know this step builds a CSV file with 70% of the tarteel_users data, but I need to understand explicitly what these numbers mean.
Thank you in advance.