The-Data-Alchemists-Manipal / MindWave

MindWave is an open-source project designed for beginners to learn about data science, machine learning, deep learning, and reinforcement learning algorithms using Python. The project offers a platform for implementing relevant algorithms, with open-source tools and libraries.
MIT License
96 stars, 145 forks

GSSOC '23: Speech Emotion Recognition using Deep Learning Network Flow #565

Closed: sujanrupu closed this issue 1 year ago

sujanrupu commented 1 year ago

From a machine learning perspective, speech emotion recognition is a classification problem: an input audio sample must be assigned to one of a few predefined emotions. The challenge goes beyond the technical, though. How does one even define an emotion, and how does one consistently choose a class for an audio sample that can be ambiguous even to humans?

The issue is more pressing for dataset creators, but it also matters when evaluating a trained model. Further below, we will see that our dataset contains two similar-sounding emotions, “calm” and “neutral,” which can be hard even for humans to tell apart in ambiguous cases. Meanwhile, “angry” and “happy” differ prominently, so the model can learn to separate them quickly.
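One way to check whether a trained model struggles with “calm” versus “neutral” is to look at a confusion matrix instead of a single accuracy number. The following is only a minimal sketch: the label order, the `confusion_matrix` helper, and the `y_true` / `y_pred` values are made up for illustration and are not results from this project.

```python
import numpy as np

# Emotion names taken from the discussion above; the index order is an assumption.
EMOTIONS = ["calm", "neutral", "angry", "happy"]

def confusion_matrix(y_true, y_pred, n_classes):
    """Count how often true class i (row) was predicted as class j (column)."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[true_label, pred_label] += 1
    return matrix

# Hypothetical labels, invented only to show the idea (not real model output).
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 0, 1, 0, 2, 2, 3, 3]

cm = confusion_matrix(y_true, y_pred, len(EMOTIONS))

# Recall per class: correct predictions divided by the number of true samples.
per_class_recall = cm.diagonal() / cm.sum(axis=1)

for name, recall in zip(EMOTIONS, per_class_recall):
    print(f"{name:>8}: recall = {recall:.2f}")
```

Large off-diagonal counts between two rows (here, deliberately, “calm” and “neutral”) would suggest the model confuses those classes, which overall accuracy alone would hide.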

Machine learning models therefore need better feature extraction, and enough non-linearity, to capture the nuanced differences in speech that humans detect intuitively. Currently, researchers work with audio either as time-series data or as spectrograms, which give numeric and image representations of the signal. Each of these techniques transforms the original data in some way, so some loss of information is likely. Models still need to become more robust at learning features from audio data; robustness in classification and generation tasks will follow.
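To make the two representations concrete, here is a rough sketch that turns a time-series waveform into a magnitude spectrogram using only NumPy. It assumes a synthetic 440 Hz tone in place of real speech, and the `magnitude_spectrogram` helper, frame length, hop size, and sample rate are illustrative choices, not settings from this project.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Short-time Fourier transform magnitudes.

    Returns an array of shape (n_frames, frame_len // 2 + 1):
    one row per time frame, one column per frequency bin.
    """
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

# Time-series view: one second of a synthetic tone (stand-in for a speech clip).
sample_rate = 8000  # Hz, assumed for this toy signal
t = np.arange(sample_rate) / sample_rate
waveform = np.sin(2 * np.pi * 440.0 * t)

# Spectrogram view: a 2-D array that can be treated like an image.
spec = magnitude_spectrogram(waveform)

# Frequency (in Hz) of each column, and the strongest one on average.
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)
peak_hz = freqs[spec.mean(axis=0).argmax()]
print(spec.shape, peak_hz)
```

The phase of each frame is discarded when taking the magnitude, and the frequency resolution is limited by the frame length. That is one small example of the information loss such transformations can introduce.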

khusheekapoor commented 1 year ago

@sujanrupu - you can go ahead! We are assigning you 21 days for this project; if it is not completed by then, it will be assigned to someone else. All the best! Name the file algorithm_dataset.ipynb and link it in the README of the labeled directory as algorithm - dataset.