ChakshuGautam / whisper-hinglish

SOTA ASR for Hinglish #1

Open ChakshuGautam opened 6 months ago

ChakshuGautam commented 6 months ago

Description

Build an ASR model on Whisper/Fairseq whose transcript uses en for English words and hi for Hindi words. Also build an alternate model that gives the transcript in 100% hi or 100% en.
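
To make the three transcript variants concrete, here is a minimal sketch that labels each word of a transcript by script and reports which variant it matches. It assumes that "hi" means Devanagari script and "en" means Latin script, which this issue does not state explicitly; the function names `tag_words` and `transcript_variant` are hypothetical and not part of Whisper or Fairseq.

```python
import re

# Assumption: "hi" = Devanagari (Unicode block U+0900-U+097F), "en" = Latin letters.
DEVANAGARI = re.compile(r"[\u0900-\u097F]")
LATIN = re.compile(r"[A-Za-z]")


def tag_words(transcript):
    """Label each whitespace-separated word as 'hi', 'en', or 'other' by script."""
    tags = []
    for word in transcript.split():
        if DEVANAGARI.search(word):
            tags.append((word, "hi"))
        elif LATIN.search(word):
            tags.append((word, "en"))
        else:
            # Digits, punctuation, or any other script.
            tags.append((word, "other"))
    return tags


def transcript_variant(transcript):
    """Return 'mixed', 'hi', 'en', or 'unknown' for a whole transcript."""
    labels = {tag for _, tag in tag_words(transcript) if tag != "other"}
    if labels == {"hi", "en"}:
        return "mixed"
    if labels == {"hi"}:
        return "hi"
    if labels == {"en"}:
        return "en"
    return "unknown"
```

A check like this could be used to sort candidate dataset transcripts into the mixed, 100% hi, and 100% en buckets before training.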

Why

  1. A custom tokenizer for Whisper/Fairseq, so that the same audio can be transcribed in different ways.
  2. Public datasets for all three variants.

Building datasets

Training

xorsuyash commented 6 months ago

@ChakshuGautam will we have English audio-transcript pairs as well as Hindi audio-transcript pairs for forced alignment?

rayaanoidPrime commented 6 months ago

@ChakshuGautam will we have English audio-transcript pairs as well as Hindi audio-transcript pairs for forced alignment?

Hey @xorsuyash, can you link your Discord? I wanted to collaborate with you on some topics :)

harshaharod21 commented 6 months ago

Hi, I'm Harsha. I'd be happy to contribute to this project. I went through the complete description above and understand the tasks and the task flow. I can see that a few of the tasks here are already assigned to other contributors, so which task can I start working on? Should I start with the tasks in issue 2? Also, should we be looking for datasets of both types, that is, mixed-language transcription and monolingual transcription?