OpenPecha / stt-split-audio

MIT License



OpenPecha

stt-split-audio

Owner(s)


RFXs

Requests for work (RFWs) and requests for comments (RFCs) associated with this project:

Table of contents

• Project description
• Who this project is for
• Project dependencies
• Instructions for use
• Contributing guidelines
• Additional documentation
• How to get help
• Terms of use


Project description

stt-split-audio helps you feed audio segments to stt.pecha.tools for annotation.

Who this project is for

This project is intended for STT training data managers who want to supply audio segments for annotation.

Project dependencies

Before using stt-split-audio, ensure you have:

Instructions for use

Get started with stt-split-audio by checking the catalog for a department and determining which ID range to upload to stt.pecha.tools.

Install stt-split-audio

  1. Get the Google Cloud client secret, which is needed to download files from Google Drive

    Log in to the Google Cloud Console and create a project. Go to "APIs & Services" > "Credentials" > "OAuth 2.0 Client IDs", download the client secret, and rename it to credentials.json. Upload the credentials.json file to the util folder in stt-split-audio.


  2. Install ffmpeg on EC2 if you are using EC2 with Amazon Linux. Use this link to install ffmpeg. After following the steps from the link, also run:

    ln -s /usr/local/bin/ffmpeg/ffprobe /usr/bin/ffprobe

  3. Log in to the AWS CLI with:

    aws configure

  4. Database credentials: create a .env file in util with the following environment variables:

    • HOST
    • DBNAME
    • DBUSER
    • PASSWORD
  5. Run the scripts. Sample commands for individual steps (for example, YouTube link download from a new catalog):

    • First, change to the util directory: cd util

    • Download YouTube videos and split into audio:

      bash code:

      python ../audio_download_and_split/yt_download.py --config ../json_config/nw_config.json
    • Run inference on audio segments:

      bash code:

      python ../inference_runner/run_inference.py --config ../json_config/nw_config.json
    • Generate CSV from inference results:

      bash code:

      python ../make_db_csv/make_csv.py --config ../json_config/nw_config.json

      How to run the entire flow (for example, YouTube link download from a new catalog): To run all the steps in one go, use the run_all.sh script, which runs the three key Python scripts (yt_download.py, run_inference.py, make_csv.py) in sequence. The script checks for errors after each step and stops if one occurs.

    Ensure your environment is set up:

    • Google Cloud credentials are in place.
    • Required Python libraries are installed.
    • AWS CLI is configured and authenticated.

    Go to the util directory: cd util/

    Run the shell script: ../run_all.sh
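Part of the environment checklist above can be automated before a run. The following is a minimal sketch, not part of the repository: the names preflight and parse_env_keys are hypothetical, the .env file is assumed to contain plain KEY=VALUE lines, and the tool check only confirms that ffmpeg, ffprobe, and aws are on PATH. It does not verify that the AWS CLI is authenticated or that the required Python libraries are installed.

```python
import shutil
from pathlib import Path

# Variable names taken from the database credentials step of this README.
REQUIRED_ENV_KEYS = ("HOST", "DBNAME", "DBUSER", "PASSWORD")
# Command-line tools the setup steps mention.
REQUIRED_TOOLS = ("ffmpeg", "ffprobe", "aws")


def parse_env_keys(env_text):
    """Return the set of variable names defined in simple KEY=VALUE lines."""
    keys = set()
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def preflight(util_dir, which=shutil.which):
    """Return a list of readable problems; an empty list means no problem was found."""
    util_dir = Path(util_dir)
    problems = []

    if not (util_dir / "credentials.json").is_file():
        problems.append("missing util/credentials.json")

    env_file = util_dir / ".env"
    if env_file.is_file():
        defined = parse_env_keys(env_file.read_text())
        for key in REQUIRED_ENV_KEYS:
            if key not in defined:
                problems.append(f".env is missing {key}")
    else:
        problems.append("missing util/.env")

    # `which` is injectable so the check can be exercised without the tools installed.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    return problems
```

From inside the util directory, calling preflight(".") and printing each returned problem gives a quick readiness report before running ../run_all.sh.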

    [Figure: implementation flow]
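The sequential, stop-on-error behavior described for run_all.sh can be sketched in Python. This is an illustration only, not the contents of run_all.sh: the run_all function and its runner parameter are hypothetical, while the script paths and the --config value are copied from the sample commands in this README and assume the working directory is util.

```python
import subprocess
import sys

# Paths copied from the sample commands; they are relative to the util directory.
CONFIG = "../json_config/nw_config.json"
STEPS = [
    "../audio_download_and_split/yt_download.py",
    "../inference_runner/run_inference.py",
    "../make_db_csv/make_csv.py",
]


def run_all(steps=STEPS, config=CONFIG, runner=subprocess.run):
    """Run each step in order; return the first non-zero exit code, else 0."""
    for script in steps:
        cmd = [sys.executable, script, "--config", config]
        print("Running:", " ".join(cmd))
        result = runner(cmd)
        if result.returncode != 0:
            # Stop at the first failing step so later steps never see bad input.
            print(f"Step failed: {script} (exit code {result.returncode})")
            return result.returncode
    return 0
```

Calling sys.exit(run_all()) from the util directory would propagate a failing step's exit code to the shell, which mirrors the stop-on-error behavior described above.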

Transfer Text Function

Overview

The transfer_text function aligns and transfers annotations from predicted text (in a TSV file) to the original text (in a text file). The output is a DataFrame containing transferred annotations, ensuring a one-to-one correspondence between the predicted and original text.
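The implementation of transfer_text is not shown here, so the following is only a sketch of one way such an alignment could work. It assumes the annotations are segment boundaries, takes the predicted segments as a list of strings (for example, one per TSV row), and uses Python's difflib to map each boundary onto the original text. The names transfer_segments and map_offset are hypothetical, and the function returns a list of row dictionaries that could be passed to pandas.DataFrame.

```python
from difflib import SequenceMatcher
from itertools import accumulate


def map_offset(offset, opcodes):
    """Map a character offset in the predicted text to one in the original."""
    for tag, i1, i2, j1, j2 in opcodes:
        if i1 <= offset <= i2:
            if tag == "equal":
                return j1 + (offset - i1)
            # Inside an edited region: stay within that region of the original.
            return min(j1 + (offset - i1), j2)
    return 0  # only reached when both texts are empty


def transfer_segments(predicted_segments, original_text):
    """Cut original_text into one piece per predicted segment."""
    predicted = "".join(predicted_segments)
    matcher = SequenceMatcher(None, predicted, original_text, autojunk=False)
    opcodes = matcher.get_opcodes()

    # Segment boundaries as cumulative character offsets in the predicted text.
    boundaries = [0] + list(accumulate(len(s) for s in predicted_segments))
    mapped = [map_offset(b, opcodes) for b in boundaries]
    # Force the first and last boundaries so the pieces cover the whole original.
    mapped[0], mapped[-1] = 0, len(original_text)

    return [
        {"predicted": seg, "original": original_text[start:end]}
        for seg, start, end in zip(predicted_segments, mapped, mapped[1:])
    ]
```

Because the boundaries are mapped in increasing order and the outer ones are pinned to the ends of the original text, each predicted segment receives exactly one piece of the original and the pieces concatenate back to the full original text.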

Contributing guidelines

If you'd like to help out, check out our contributing guidelines.

How to get help

Terms of use

stt-split-audio is licensed under the MIT License.