OpenPecha / stt-catalog-merger

MIT License

STT0056: Prepare training data for the colloquial dataset on Hugging Face. #10

Open gangagyatso4364 opened 1 month ago

gangagyatso4364 commented 1 month ago

Description

The task is to prepare a training dataset for a colloquial transcription model, which will transcribe spoken audio into written text.

Completion Criteria

A training dataset is published on Hugging Face and ready for use in model training.

gangagyatso4364 commented 1 month ago

Share with yash and gaychay, and cc the other developers.

gangagyatso4364 commented 1 month ago

COLLOQUIAL DATASET LINK: https://huggingface.co/datasets/ganga4364/Colloquial_training_data
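
For anyone picking this up for training, here is a minimal sketch of loading the dataset from the link above with the Hugging Face `datasets` library. The helper names `repo_id_from_url` and `load_colloquial_dataset` are made up for illustration. The split names and column layout are not stated in this thread, so inspect the returned object before relying on them. If the repository is private or gated, you would also need to log in first (for example with `huggingface-cli login`).

```python
from urllib.parse import urlparse

# Dataset link as posted in this issue.
DATASET_URL = "https://huggingface.co/datasets/ganga4364/Colloquial_training_data"


def repo_id_from_url(url: str) -> str:
    """Extract the '<user>/<name>' repo id from a Hugging Face dataset URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:3])


def load_colloquial_dataset(url: str = DATASET_URL):
    """Download the dataset from the Hub (needs network and `pip install datasets`)."""
    # Imported lazily so the URL helper works without the library installed.
    from datasets import load_dataset

    # With no split argument this returns every available split.
    return load_dataset(repo_id_from_url(url))


if __name__ == "__main__":
    print(repo_id_from_url(DATASET_URL))
```

Calling `load_colloquial_dataset()` and printing the result should list the available splits and features, which is a quick way to confirm the data is ready before wiring it into a training script.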