MiteshPuthran / Speech-Emotion-Analyzer

The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
MIT License
1.3k stars 438 forks source link

Inference code? #50

Open arianaa30 opened 4 years ago

arianaa30 commented 4 years ago

Hi, Thanks for the nice work. I was trying to just use your model for inference. I looked at the notebook and copied the necessary parts for inference, but get error This LabelEncoder instance is not fitted yet. Can you help what is missing in this code?

import os

from keras import regularizers
import keras
from keras.callbacks import ModelCheckpoint
from keras.layers import Conv1D, MaxPooling1D, AveragePooling1D, Dense, Embedding, Input, Flatten, Dropout, Activation, LSTM
from keras.models import Model, Sequential, model_from_json
from keras.preprocessing import sequence
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer
from keras.utils import to_categorical
import librosa
import librosa.display
from matplotlib.pyplot import specgram
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

opt = keras.optimizers.rmsprop(lr=0.00001, decay=1e-6)
lb = LabelEncoder()

json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("saved_models/Emotion_Voice_Detection_Model.h5")
print("Loaded model from disk")

X, sample_rate = librosa.load('h04.wav', res_type='kaiser_fast',duration=2.5,sr=22050*2,offset=0.5)
sample_rate = np.array(sample_rate)
mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=13),axis=0)
featurelive = mfccs
livedf2 = featurelive
livedf2= pd.DataFrame(data=livedf2)
livedf2 = livedf2.stack().to_frame().T
twodim= np.expand_dims(livedf2, axis=2)
livepreds = loaded_model.predict(twodim, batch_size=32, verbose=1)

livepreds1=livepreds.argmax(axis=1)
liveabc = livepreds1.astype(int).flatten()
livepredictions = (lb.inverse_transform((liveabc)))
print(livepredictions)
WenjingZhang0214 commented 3 years ago

You can find the emotion index list from here .You just need the variance liveabc, not the LabelEncoder instance lb.

Aayush795 commented 3 years ago

@WenjingZhang0214 https://github.com/DimensionNXG/Speech-Emotion-Analyzer/blob/master/inference.py