SpeechEmoRec
Introduction
This project aims to implement the speech emotion recognition strategy proposed in "Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching".
Runtime environment
CPU host:
- Ubuntu 16.04
- Python 3.5
- TensorFlow 1.7.0
GPU server:
- tensorflow-gpu 1.7.0
- NVIDIA driver version: 390
- CUDA 9.0
- cuDNN 7.0
Instructions
Preprocessing Data
- Update the path where the dataset should be saved in path.py
- Download the Berlin Database of Emotional Speech
- Berlin Dataset
$ python load_emodb.py
- eNTERFACE Dataset
Download the eNTERFACE05 dataset and update the dataset root
- Start preprocessing
$ python melSpec.py
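The preprocessing step can be pictured with a minimal sketch, which is not the contents of melSpec.py. It assumes a log Mel-spectrogram is already available as an (n_mels, n_frames) array, and it shows how the spectrogram could be cut into overlapping fixed-length segments with static, delta and delta-delta channels, giving image-like inputs for a CNN. The function names `delta` and `segment_spectrogram`, the segment length of 64 frames and the hop of 30 frames are illustrative assumptions.

```python
import numpy as np


def delta(feat, width=2):
    """Regression-based delta along time (axis 1) of an (n_mels, n_frames) array."""
    # Repeat the edge frames so every frame has `width` neighbours on both sides.
    padded = np.pad(feat, ((0, 0), (width, width)), mode="edge")
    n_frames = feat.shape[1]
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = np.zeros_like(feat, dtype=np.float64)
    for k in range(1, width + 1):
        ahead = padded[:, width + k: width + k + n_frames]   # feat[t + k]
        behind = padded[:, width - k: width - k + n_frames]  # feat[t - k]
        out += k * (ahead - behind)
    return out / denom


def segment_spectrogram(log_mel, seg_len=64, hop=30):
    """Split a log Mel-spectrogram into overlapping 3-channel segments.

    Returns an array of shape (n_segments, n_mels, seg_len, 3), where the
    channels are static, delta and delta-delta. seg_len and hop are assumed
    values, not settings read from this repository.
    """
    log_mel = np.asarray(log_mel, dtype=np.float64)
    d1 = delta(log_mel)
    d2 = delta(d1)
    stacked = np.stack([log_mel, d1, d2], axis=-1)  # (n_mels, n_frames, 3)
    n_frames = log_mel.shape[1]
    starts = range(0, n_frames - seg_len + 1, hop)
    segments = [stacked[:, s:s + seg_len, :] for s in starts]
    if not segments:
        # Utterance shorter than one segment: return an empty batch.
        return np.empty((0, log_mel.shape[0], seg_len, 3))
    return np.stack(segments)
```

Each segment would then typically be resized to the input size expected by the network before fine-tuning.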
Feature Extraction
Fine-tune AlexNet with TensorFlow
$ python finetune.py
Discriminant Temporal Pyramid Matching
$ python dtpm.py -s
$ python dtpm.py -n
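As a rough illustration of the pooling idea behind temporal pyramid matching, the sketch below turns a variable number of segment-level feature vectors into one fixed-length utterance-level vector. It splits the segments along time at several pyramid levels and applies Lp-norm pooling to each part. It is not the code in dtpm.py: the names `lp_pool` and `temporal_pyramid`, the levels `(1, 2, 4)` and the fixed `p` are assumptions, and the discriminant selection of `p` described in the paper is omitted.

```python
import numpy as np


def lp_pool(feats, p=2.0):
    """Lp-norm pooling over axis 0 of an (n_segments, dim) array.

    p=1 is the mean of absolute values; large p approaches max pooling.
    """
    feats = np.abs(np.asarray(feats, dtype=np.float64))
    return np.mean(feats ** p, axis=0) ** (1.0 / p)


def temporal_pyramid(segment_feats, levels=(1, 2, 4), p=2.0):
    """Concatenate Lp-pooled features over a temporal pyramid.

    segment_feats: (n_segments, dim), ordered in time.
    Returns a vector of length dim * sum(levels).
    """
    segment_feats = np.asarray(segment_feats, dtype=np.float64)
    dim = segment_feats.shape[1]
    pooled = []
    for n_parts in levels:
        # array_split tolerates segment counts that n_parts does not divide.
        for part in np.array_split(segment_feats, n_parts, axis=0):
            if part.shape[0] == 0:
                # Fewer segments than parts: fill the empty part with zeros.
                pooled.append(np.zeros(dim))
            else:
                pooled.append(lp_pool(part, p))
    return np.concatenate(pooled)
```

Because the output length depends only on the feature dimension and the pyramid levels, utterances of different durations map to vectors of the same size, which is what a downstream classifier needs.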
Classifier
Support Vector Machine
$ python svm.py
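For orientation, here is a hedged sketch of the classification stage using scikit-learn, which is an assumption: the source does not say which SVM library svm.py uses. It trains a linear-kernel SVM on synthetic, randomly generated stand-ins for utterance-level features. Seven classes are used to mirror the seven emotion categories in the Berlin database. The feature dimension, kernel and `C` value are illustrative, not settings taken from this repository.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder features: one Gaussian cluster per emotion class.
rng = np.random.RandomState(0)
n_classes, per_class, dim = 7, 40, 32
centers = rng.normal(scale=5.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(size=(per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), per_class)

# Hold out a stratified quarter of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize the features, then fit a linear-kernel SVM (assumed settings).
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print("held-out accuracy on synthetic data:", accuracy)
```

With real data, a speaker-independent split is usually preferable to a random one, so that the same speaker does not appear in both the training and test sets.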
References:
Reference Model:
Reference Papers:
- ImageNet Classification with Deep Convolutional Neural Networks
- Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching
- Geometric ℓp-norm feature pooling for image classification