OpenPecha / stt-combine-datasets

MIT License

README


OpenPecha

STT Combine Datasets

Owner(s)

RFXs

Requests for work (RFWs) and requests for comments (RFCs) associated with this project:

Table of contents

Project description
Who this project is for
Project dependencies
Instructions for use
Contributing guidelines
Additional documentation
How to get help
Terms of use


Project description

STT Combine Datasets helps you combine datasets from three sources: stt.pecha.tools, prodigy, and saymore for mv. The data from these sources is combined into a single file, 04_combine_all.tsv. The benchmark dataset is then created from this combined TSV.
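
As a rough illustration of the combining step, the sketch below concatenates several TSV files into one. It is not the repository's actual code: the function name `combine_tsv_files` is hypothetical, and it assumes that every source export is already a TSV file with the same header row, which may not match the real pipeline.

```python
import csv


def combine_tsv_files(source_paths, output_path):
    """Concatenate TSV files that share a header row into a single TSV file.

    Hypothetical helper for illustration only; assumes identical columns
    in every source file. Returns the number of data rows written.
    """
    if not source_paths:
        raise ValueError("at least one source file is required")

    fieldnames = None
    rows = []
    for path in source_paths:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f, delimiter="\t")
            if reader.fieldnames is None:
                raise ValueError(f"{path} is empty")
            if fieldnames is None:
                # The first file defines the expected columns.
                fieldnames = reader.fieldnames
            elif reader.fieldnames != fieldnames:
                raise ValueError(f"{path} has a different header: {reader.fieldnames}")
            rows.extend(reader)

    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

With three per-source exports on disk (the input file names are up to you), a call such as `combine_tsv_files([path_a, path_b, path_c], "04_combine_all.tsv")` would write the combined file. Rejecting mismatched headers up front is a deliberate choice here, so that a column change in one source does not silently corrupt the combined dataset.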

Who this project is for

This project is intended for STT data pipeline maintainers who want to update the aggregate STT/TTS dataset.

Project dependencies

Before using STT Combine Datasets, ensure you have:

Instructions for use

Get started with STT Combine Datasets by going through the installation steps below.

Install STT Combine Datasets

  1. Clone the repository

    git clone git@github.com:OpenPecha/stt-combine-datasets.git

  2. Create a virtual environment

    a. Create a virtual environment

    python3 -m venv .env

    b. Activate the environment

    source .env/bin/activate

  3. Install required packages

    pip install -r requirements.txt
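
To confirm that the packages were installed into the virtual environment rather than the system Python, a small check like the one below can help. This is a minimal sketch and not part of the repository; the function name `in_virtualenv` is hypothetical. It relies on the fact that, inside an environment created with `python3 -m venv`, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # In a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it after `source .env/bin/activate`; if it reports that no virtual environment is active, repeat step 2 before installing the requirements.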

Contributing guidelines

If you'd like to help out, check out our contributing guidelines.

Additional documentation

For more information:

How to get help

Terms of use

STT Combine Datasets is licensed under the MIT License.