voidful / SpeechMix

Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together
42 stars 10 forks source link

SpeechMix

Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together.

Implementation of:
Large-Scale Self- and Semi-Supervised Learning for Speech Translation - ACL2021
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models - ACL2021
Lightweight Adapter Tuning for Multilingual Speech Translation - Interspeech 2021
Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task - ACL2021
A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks - ICASSP 2021

Installation

pip install

pip install speechmix

Build from source

git clone and cd into this project.

pip install -e .

Name the project(!important)

WANDB_PROJECT=amazing

base

python train.py --speech_model_config wav2vec2 \
--nlp_model_config facebook/bart-base \
--SpeechMixEED \
--dataset librispeech_asr \
--field clean \
--train_split train.100 \
--test_split validation \
--batch 3 \
--grad_accum 20 \
--epoch 30 \
--worker 15 \
--share_layer_ratio 0 \
--down_scale 2 \
--lr 4e-5 \
--warmup_steps 500 \
--wandb \
--notes base