GNU Affero General Public License v3.0
# Speech2dCNN_LSTM

Description

A PyTorch implementation of "Speech emotion recognition using deep 1D & 2D CNN LSTM networks", built with PyTorch Lightning and using wandb sweeps for hyperparameter search. I'm not affiliated with the authors of the paper.
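For readers unfamiliar with wandb sweeps, the following is a minimal sketch of what a sweep configuration can look like. The metric name `val_acc` and every parameter name and range here are assumptions chosen for illustration; they are not taken from this repository.

```python
# Hypothetical wandb sweep configuration.
# The metric name, parameter names, and ranges are illustrative assumptions,
# not the values used in this project.
sweep_config = {
    "method": "random",  # sample hyperparameters at random
    "metric": {"name": "val_acc", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"distribution": "uniform", "min": 1e-5, "max": 1e-2},
        "batch_size": {"values": [16, 32, 64]},
        "lstm_hidden_size": {"values": [64, 128, 256]},
    },
}

# List the hyperparameters this sweep would vary.
for name, spec in sweep_config["parameters"].items():
    print(name, spec)
```

A dictionary like this is typically passed to `wandb.sweep(sweep_config, project=...)`, and the returned sweep id is then handed to `wandb.agent(sweep_id, function=train)`, where `train` stands for whatever training entry point is being tuned.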

Example of a spectrogram image used as input

How to run

First, install dependencies

# clone project   
git clone https://github.com/RicardoP0/Speech2dCNN_LSTM.git

# install project   
cd Speech2dCNN_LSTM
pip install -e .   
pip install -r requirements.txt

Next, navigate to the CNN+LSTM module and run it.

# module folder
cd research_seed/audio_classification/   

# run module
python cnn_trainer.py    

Main Contribution

Results

Validation accuracy reaches 0.4 and the F1 score reaches 0.3 on 8 classes.

Figures: validation accuracy on 8 classes; F1 on 8 classes.
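To make the two metrics concrete, here is a small pure-Python sketch of accuracy and macro-averaged F1 for a multi-class problem. Whether the reported F1 is macro-averaged is an assumption (the averaging mode is not stated here), and the helper names `accuracy` and `macro_f1` are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, n_classes=8):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in range(n_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # A class with no true or predicted samples contributes 0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / n_classes


# Toy labels for illustration only; these are not results from the model.
y_true = [0, 1, 2, 3, 4, 5, 6, 7, 0, 1]
y_pred = [0, 1, 2, 3, 4, 5, 6, 0, 1, 1]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Macro averaging weights every class equally, so poor performance on rare emotion classes pulls the score down even when overall accuracy looks acceptable.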