Audio-Visual-Emotion-and-Sentiment-Research
Deep Neural Network and its application with TensorFlow project
Members and corresponding parts:
Audio parts: Enis Berk Çoban and Yunhua Zhao
We train emotion-detection models separately on the audio-song, audio-speech, and combined audio-song-speech files. We also try different neural network architectures (LSTMs, CNNs) and use a pre-trained model (VGGish) for transfer learning.
Video parts: Patrick Jean-Baptiste
We use images of actors' faces that express an emotion. The images are extracted from the video-only and audio-visual files of the RAVDESS dataset, for both speech and song. The objective is to build a visual model that recognizes emotions from images.
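RAVDESS encodes the modality, the vocal channel (speech or song), and the emotion label in each filename as seven two-digit fields. The sketch below shows one way such a filename could be decoded. The function name `decode_ravdess_filename` and the dictionary keys are hypothetical and are not taken from the project's scripts.

```python
from pathlib import Path

# Code tables from the RAVDESS filename convention.
MODALITY = {"01": "full-AV", "02": "video-only", "03": "audio-only"}
VOCAL_CHANNEL = {"01": "speech", "02": "song"}
EMOTION = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
INTENSITY = {"01": "normal", "02": "strong"}
STATEMENT = {
    "01": "Kids are talking by the door",
    "02": "Dogs are sitting by the door",
}


def decode_ravdess_filename(path):
    """Decode a RAVDESS filename into a dict of readable labels.

    Hypothetical helper: the name and the returned keys are assumptions.
    Unknown codes raise KeyError; a wrong field count raises ValueError.
    """
    parts = Path(path).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 dash-separated fields, got {len(parts)}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    actor_id = int(actor)
    return {
        "modality": MODALITY[modality],
        "vocal_channel": VOCAL_CHANNEL[channel],
        "emotion": EMOTION[emotion],
        "intensity": INTENSITY[intensity],
        "statement": STATEMENT[statement],
        "repetition": int(repetition),
        "actor": actor_id,
        # Odd-numbered actors are male, even-numbered actors are female.
        "gender": "male" if actor_id % 2 == 1 else "female",
    }
```

For example, `decode_ravdess_filename("03-01-06-01-02-01-12.wav")` describes an audio-only speech clip labeled fearful and performed by actor 12.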
Project timeline:
- Explore and pre-process the dataset:\
Enis extracted the audio from the video files;\
Yunhua decoded the filenames;\
Patrick extracted images from the video files.
- \
Yunhua trained an LSTM model on the audio-song files and measured its accuracy, then applied the same model to the audio-speech files, and then to the combined audio-song-speech files.\
Enis generated VGGish embeddings to train a model on the audio-song-speech files.\
Patrick did the initial video preprocessing.
- \
Enis split the dataset into train, validation, and test sets and saved the split in a CSV file, so that everyone could train on the same sets and we could merge models or their outputs.
- \
Patrick detected and extracted the actors' faces from the images.
- \
Enis tried to reproduce Yunhua's results and discovered a bug;\
Yunhua tried using Enis's model output with Yunhua's own model;\
Patrick trained a model on the split dataset for visual emotion classification.
- \
Enis provided several organized files and led the team in moving forward.
- \
Enis trained a model that has a separate module for each type of input;\
Yunhua merged the original audio and video features and fed the merged features into a single model built from dense layers.
- We prepared the presentation.
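The shared train/validation/test split mentioned in the timeline above can be sketched as follows. This is a minimal illustration, not the script used in the project: the function names, the split fractions, the seed, and the CSV column names (`filename`, `split`) are all assumptions, and the real split may have been made differently (for example, per actor).

```python
import csv
import random


def assign_splits(filenames, val_frac=0.1, test_frac=0.1, seed=0):
    """Map each filename to 'train', 'validation', or 'test'.

    Hypothetical helper: fractions and seed are illustrative defaults.
    The names are sorted before shuffling so the result does not depend
    on the order in which the files were listed.
    """
    names = sorted(filenames)
    random.Random(seed).shuffle(names)  # seeded, so the split is reproducible
    n_test = int(len(names) * test_frac)
    n_val = int(len(names) * val_frac)
    splits = {}
    for i, name in enumerate(names):
        if i < n_test:
            splits[name] = "test"
        elif i < n_test + n_val:
            splits[name] = "validation"
        else:
            splits[name] = "train"
    return splits


def write_split_csv(splits, fileobj):
    """Write the split to an open text file as CSV with a header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["filename", "split"])  # assumed column names
    for name in sorted(splits):
        writer.writerow([name, splits[name]])
```

Writing the assignment to one file and having every notebook read it keeps the test set identical across the audio and video models, which is what makes merging their outputs meaningful.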
File organization:
- We use the "Issues" to track some problems such as
- Everybody completed their experiments on notebooks such as:
- We created some scripts that handle data processing functions such as:
- We also have a folder for assets such as models or embedding files:
- Our email list
- dataset info