Chore: Multi-voice audio generation

homebrewltd / llama3-s

Llama3.1 learns to Listen


Open hahuyhoang411 opened 1 month ago

hahuyhoang411 commented 1 month ago

Motivation:

The current dataset uses a single (female) voice, so the model can't answer properly when the input audio is a male voice.

Goal:

Side idea:

Overlap audio samples between speakers, e.g.:
- Speaker 1: 100 samples
- Speaker 2: 100 samples (20 of them overlap with Speaker 1)
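A minimal sketch of the overlap idea above, assuming each sample is identified by an ID and that "overlap" means both speakers voice the same underlying samples. The function name `assign_with_overlap` and its parameters are hypothetical, not part of the repo.

```python
import random


def assign_with_overlap(sample_ids, per_speaker=100, overlap=20, seed=0):
    """Split sample IDs between two speakers so that `overlap` IDs are shared.

    Hypothetical helper: each speaker gets `per_speaker` IDs in total,
    of which `overlap` are common to both.
    """
    if overlap > per_speaker:
        raise ValueError("overlap cannot exceed per_speaker")
    needed = 2 * per_speaker - overlap  # distinct IDs required overall
    ids = list(sample_ids)
    if len(ids) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(ids)}")

    rng = random.Random(seed)
    pool = rng.sample(ids, needed)  # distinct IDs, drawn without replacement

    unique_each = per_speaker - overlap
    shared = pool[:overlap]
    speaker_1 = shared + pool[overlap:overlap + unique_each]
    speaker_2 = shared + pool[overlap + unique_each:]
    return speaker_1, speaker_2


# Toy usage: 1000 placeholder sample IDs
speaker_1, speaker_2 = assign_with_overlap(range(1000))
```

With the defaults, each speaker ends up with 100 sample IDs and exactly 20 of them appear in both lists; the shared subset is what would let you compare the model's behaviour on the same content across voices.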

hungphongtrn commented 1 month ago

I created a dataset with one sample per unique speaker, as a reference for WhisperSpeech, here: unique_speaker_audio

Basically, I sampled data from openslr/librispeech_asr: for each unique speaker ID, I randomly picked 1 sample.
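Roughly, the per-speaker sampling can be sketched as below. This is not the actual script used for the dataset; it assumes each row is a dict with a `speaker_id` field (as in the librispeech_asr rows, to my knowledge), and the helper name `pick_one_per_speaker` plus the toy rows are made up for illustration.

```python
import random
from collections import defaultdict


def pick_one_per_speaker(rows, seed=0):
    """Return {speaker_id: row}, with one randomly chosen row per speaker."""
    rng = random.Random(seed)

    # Group rows by speaker ID.
    by_speaker = defaultdict(list)
    for row in rows:
        by_speaker[row["speaker_id"]].append(row)

    # Iterate speakers in sorted order so a given seed is reproducible.
    return {spk: rng.choice(by_speaker[spk]) for spk in sorted(by_speaker)}


# Toy stand-in for dataset rows (IDs are placeholders, not real utterances)
toy_rows = [
    {"speaker_id": 1, "id": "utt-a"},
    {"speaker_id": 1, "id": "utt-b"},
    {"speaker_id": 2, "id": "utt-c"},
    {"speaker_id": 3, "id": "utt-d"},
    {"speaker_id": 3, "id": "utt-e"},
]
picked = pick_one_per_speaker(toy_rows)
```

The result has one entry per distinct speaker, so its length is the number of unique speakers in the input.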

We have 921 unique speakers. You guys can submit your voice to :))