Requests for work (RFWs) and requests for comments (RFCs) associated with this project:
Project description • Who this project is for • Project dependencies • Instructions for use • Contributing guidelines • Additional documentation • How to get help • Terms of use
STT Combine Datasets helps you combine datasets from three sources: stt.pecha.tools, prodigy, and saymore for mv. The data from these sources is combined into a single file, 04_combine_all.tsv, and the benchmark dataset is created from this combined TSV.
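To illustrate the idea of merging several sources into one TSV, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation: the function name `combine_tsvs`, the added `source` column, and the assumption that every input file shares the same header are all hypothetical.

```python
import csv


def combine_tsvs(sources, output_path):
    """Concatenate TSV files into one file, tagging each row with its source.

    `sources` maps a source label to a TSV path. This sketch assumes every
    input file has a header row with the same column names.
    """
    fieldnames = []
    rows = []
    for source_name, path in sources.items():
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f, delimiter="\t")
            if not fieldnames:
                # Take the column order from the first non-empty file.
                fieldnames = list(reader.fieldnames or [])
            for row in reader:
                row["source"] = source_name  # hypothetical provenance column
                rows.append(row)

    out_fields = fieldnames + (["source"] if "source" not in fieldnames else [])
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops columns that are not in the first header.
        writer = csv.DictWriter(
            f, fieldnames=out_fields, delimiter="\t", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call might look like `combine_tsvs({"stt": "stt.tsv", "prodigy": "prodigy.tsv", "mv": "mv.tsv"}, "04_combine_all.tsv")`, where the three input file names are placeholders rather than names used by this repository.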
This project is intended for STT data pipeline maintainers who want to update the aggregate STT/TTS dataset.
Before using STT Combine Datasets, ensure you have:
Get started with STT Combine Datasets by going through the following steps:
Clone the repository
git clone git@github.com:OpenPecha/stt-combine-datasets.git
Create a virtual environment
a. Create a virtual environment
python3 -m venv .env
b. Activate the environment
source .env/bin/activate
Install required packages
pip install -r requirements.txt
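If you want to confirm that the virtual environment is active before installing packages or running anything, a small check like the following may help. This is an optional sketch and not part of the repository; the function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when Python is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same `python3` you used to create `.env`; if it reports `False`, activate the environment again before running `pip install`.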
If you'd like to help out, check out our contributing guidelines.
For more information:
STT Combine Datasets is licensed under the MIT License.