Project description • Who this project is for • Project dependencies • Instructions for use • Contributing guidelines • Additional documentation • How to get help • Terms of use
stt-split-audio helps you feed audio segments to stt.pecha.tools for annotation.
This project is intended for STT training data managers who want to supply audio segments for annotation.
Before using stt-split-audio, ensure you have:
Get started with stt-split-audio by checking the catalog for a department and identifying which ID range to upload to stt.pecha.tools.
Get the Google Cloud Client secret to download files from Google Drive
1. Log in to the Google Cloud Console and create a project.
2. Go to "APIs & Services" > "Credentials" > "OAuth 2.0 Client IDs" and download the client secret.
3. Rename the downloaded file to credentials.json.
4. Upload credentials.json to the util folder in stt-split-audio.
Install ffmpeg on EC2
If you are using Amazon Linux on EC2, use this link to install ffmpeg. After following the steps from that link, also run:

```bash
ln -s /usr/local/bin/ffmpeg/ffprobe /usr/bin/ffprobe
```
Log in to the AWS CLI with:

```bash
aws configure
```
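aws configure interactively prompts for an access key ID, secret access key, default region, and output format, and stores them under ~/.aws/. The resulting credentials file looks like this (values are placeholders):

```ini
# ~/.aws/credentials
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```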
Database credentials
Create a .env file in util with the following environment variables:
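The exact variable names depend on the project's database setup; a hypothetical .env might look like this (all names and values below are placeholders, not the project's actual configuration):

```ini
# util/.env -- placeholder names; substitute the project's actual variables
DB_HOST=your-database-host
DB_USER=your-database-user
DB_PASSWORD=your-database-password
DB_NAME=your-database-name
```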
Ways to run the script
Sample commands for individual steps (for example, YouTube link download from a new catalog):
First, cd to the util directory:

```bash
cd util
```
Download YouTube videos and split into audio:
```bash
python ../audio_download_and_split/yt_download.py --config ../json_config/nw_config.json
```
Run inference on audio segments:
```bash
python ../inference_runner/run_inference.py --config ../json_config/nw_config.json
```
Generate CSV from inference results:
```bash
python ../make_db_csv/make_csv.py --config ../json_config/nw_config.json
```
How to run the entire flow (for example, YouTube link download from a new catalog)
To run all the steps in one go, use the run_all.sh script, which executes each of the three key Python scripts (yt_download.py, run_inference.py, make_csv.py) sequentially. The script checks for errors after each step and stops if an error is encountered.
Ensure your environment is set up:

- Google Cloud credentials are in place.
- Required Python libraries are installed.
- The AWS CLI is configured and authenticated.

Go to the util directory:

```bash
cd util/
```

Run the shell script:

```bash
../run_all.sh
```
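run_all.sh itself is not reproduced here; as a sketch of its logic (script paths assumed from the individual commands above), an equivalent orchestrator in Python could look like:

```python
import subprocess
import sys

# Pipeline steps in order, matching the individual commands above.
STEPS = [
    "../audio_download_and_split/yt_download.py",
    "../inference_runner/run_inference.py",
    "../make_db_csv/make_csv.py",
]

def run_pipeline(steps, config="../json_config/nw_config.json"):
    """Run each step in order; stop and return the exit code on first failure."""
    for script in steps:
        result = subprocess.run([sys.executable, script, "--config", config])
        if result.returncode != 0:
            print(f"Step failed: {script}", file=sys.stderr)
            return result.returncode
    return 0
```

Calling `run_pipeline(STEPS)` mirrors what running `../run_all.sh` does: each step runs only if the previous one exited cleanly.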
The transfer_text function aligns and transfers annotations from predicted text (in a TSV file) to the original text (in a text file). The output is a DataFrame containing transferred annotations, ensuring a one-to-one correspondence between the predicted and original text.
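The actual implementation of transfer_text is not shown here; the core idea, token alignment followed by annotation transfer, can be sketched with difflib and pandas (the function signature and the token-level annotation format are assumptions, not the project's actual interface):

```python
import difflib

import pandas as pd

def transfer_text(predicted, original, annotations):
    """Sketch: align predicted tokens to original tokens and carry each
    token-level annotation over to its matching original token.
    `annotations` is a list parallel to `predicted` (one label per token).
    """
    matcher = difflib.SequenceMatcher(a=predicted, b=original, autojunk=False)
    rows = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":  # only matched tokens get a transferred annotation
            for offset in range(i2 - i1):
                rows.append({
                    "original": original[j1 + offset],
                    "predicted": predicted[i1 + offset],
                    "annotation": annotations[i1 + offset],
                })
    return pd.DataFrame(rows, columns=["original", "predicted", "annotation"])
```

For example, aligning `["ka", "x", "ga"]` against `["ka", "ga"]` drops the unmatched token and transfers only the annotations of the aligned pair.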
If you'd like to help out, check out our contributing guidelines.
stt-split-audio is licensed under the MIT License.