A PyTorch implementation of speech emotion recognition using deep 1D & 2D CNN-LSTM networks, built with PyTorch Lightning and using wandb sweeps for hyperparameter search. I'm not affiliated with the authors of the paper.
First, install the dependencies:
# clone project
git clone https://github.com/RicardoP0/Speech2dCNN_LSTM.git
# install project
cd Speech2dCNN_LSTM
pip install -e .
pip install -r requirements.txt
Next, navigate to the CNN+LSTM module folder and run the trainer:
# module folder
cd research_seed/audio_classification/
# run module
python cnn_trainer.py
Validation accuracy reaches 0.4 with an F1 score of 0.3 on 8 classes.
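For reference, with 8 classes the F1 score is typically macro-averaged (per-class F1, then the mean). The sketch below is a minimal, dependency-free illustration of that computation; the function name `macro_f1` is illustrative and not part of this repo, which likely computes the metric via its own training code.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Macro-averaged F1: compute F1 per class, then average over classes."""
    f1_scores = []
    for c in range(num_classes):
        # Count true positives, false positives, and false negatives for class c.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / num_classes
```

Macro averaging weights all emotion classes equally, which matters here because emotion datasets are usually class-imbalanced.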