Open gsion1 opened 3 years ago
I did that, but the confusion matrix shows that there is an error somewhere, maybe in the confusion matrix itself rather than in the prediction.
After the initial code, the confusion matrix is the one below (not a lot of errors).
Many lines should be removed, but here is the code:
#https://kgptalkie.com/human-activity-recognition-using-accelerometer-data/
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.layers import Conv2D, MaxPool2D
from tensorflow.keras.optimizers import Adam
#print(tf.__version__)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
file = open('WISDM_ar_v1.1_raw.txt')
#file = open('jogging.txt')
lines = file.readlines()
processedList = []
for i, line in enumerate(lines):
    try:
        line = line.split(',')
        last = line[5].split(';')[0]
        last = last.strip()
        if last == '':
            break
        temp = [line[0], line[1], line[2], line[3], line[4], last]
        processedList.append(temp)
    except:
        print('Error at line number: ', i)
columns = ['user', 'activity', 'time', 'x', 'y', 'z']
data = pd.DataFrame(data = processedList, columns = columns)
data.head()
data.shape
data.info()
data.isnull().sum()
data['activity'].value_counts()
data['x'] = data['x'].astype('float')
data['y'] = data['y'].astype('float')
data['z'] = data['z'].astype('float')
data.info()
Fs = 20 #sampling rate in Hz
activities = data['activity'].value_counts().index
df = data.drop(['user', 'time'], axis = 1).copy()
df.head()
df['activity'].value_counts()
label = LabelEncoder()
df['label'] = label.fit_transform(df['activity'])
df.head()
X = df[['x', 'y', 'z']]
y = df['label']
scaler = StandardScaler()
X = scaler.fit_transform(X)
scaled_X = pd.DataFrame(data = X, columns = ['x', 'y', 'z'])
scaled_X['label'] = y.values
scaled_X.head()
import scipy.stats as stats
#divide the samples into 4 s frames
Fs = 20 #sampling is 20Hz
frame_size = Fs*4 # 80 samples, 4 seconds
hop_size = Fs*2 # 40
def get_frames(df, frame_size, hop_size):
    N_FEATURES = 3
    frames = []
    labels = []
    for i in range(0, len(df) - frame_size, hop_size):
        x = df['x'].values[i: i + frame_size]
        y = df['y'].values[i: i + frame_size]
        z = df['z'].values[i: i + frame_size]
        # Retrieve the most often used label in this segment
        label = stats.mode(df['label'][i: i + frame_size])[0][0]
        frames.append([x, y, z])
        labels.append(label)
    # Bring the segments into a better shape
    frames = np.asarray(frames).reshape(-1, frame_size, N_FEATURES)
    labels = np.asarray(labels)
    return frames, labels
X, y = get_frames(scaled_X, frame_size, hop_size)
#still match with the right labels
X.shape, y.shape
X = X.reshape(X.shape[0],80,3,1)
X.shape
import keras
model = keras.models.load_model('model.h5')
from mlxtend.plotting import plot_confusion_matrix
from sklearn.metrics import confusion_matrix
print(model.predict(X))
y_pred = np.argmax(model.predict(X), axis=-1)
print(y_pred)
label_list = ["Walking", "Jogging", "Upstairs", "Downstairs", "Sitting", "Standing"]
draw_mat = 1
if draw_mat:
    mat = confusion_matrix(y, y_pred)
    plot_confusion_matrix(conf_mat=mat, class_names=label_list, show_normed=True, figsize=(7,7))
It seems the model is overfitting if this result is on the training data; otherwise it's a good model.
I'm not sure I understand what you are saying. The model used in the second script is the model saved from the original script, with the training and testing.
Do you have any idea how to solve the issue?
I've tried with 10, 15 and 50 epochs, but the results are not satisfying for my case in any of them.
Admittedly, I did not check the details of the dataset from the tutorial, but my question would be: Did you record your test data under the same circumstances as the dataset from the tutorial (i.e. sensor position, etc.)? You want the trained model to classify your own test data, right? If your data was recorded in a different manner, then I guess you have to retrain the model on your own data first.
This shows only the process of model training. You have to train on your own dataset to get correct results.
On Mon, 5 Apr 2021 at 12:32 AM, kobe28 @.***> wrote:
Admittedly, I did not check the details of the dataset from the tutorial, but my question would be: Did you record your test data under the same circumstances as the dataset from the tutorial (i.e. sensor position, etc.)? You want the trained model to classify your own test data, right? If your data was recorded in a different manner, then I guess you have to retrain the model on your own data first.
I did not use my own files but the ones from the tutorial, so I don't think this can be the source of the issue.
Ah okay, so you took the whole dataset from the course and then ran the model on it. We should expect the model to do very well, but it did not. I did not check for mistakes in your code, but the confusion matrix does look odd. The classes are very unbalanced, which makes sense since you took the whole dataset. Initially, though, "Sitting" was the smallest class with 3555 data points if I recall correctly, whereas in this case "Downstairs" and "Upstairs" are considerably smaller. Maybe you got the data somehow mixed up?
This is what I think, but I'm not able to spot the issue. The labels may also be mixed up, but I can't find the error.
Because the tutorial and my code use the same model, the confusion matrix and the predictions should be far better. The issue might appear when displaying the results.
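One thing that could be checked here, offered as a guess rather than a confirmed diagnosis: scikit-learn's `LabelEncoder` stores its classes in sorted order, so integer label `k` stands for the k-th name alphabetically, while `label_list` in the script is written in a different order. The sketch below uses only the six names from `label_list` and assumes the dataset's activity strings are exactly those six and that the saved model was trained with labels from the same kind of encoder.

```python
# Names in the order passed as class_names in the script above.
label_list = ["Walking", "Jogging", "Upstairs", "Downstairs", "Sitting", "Standing"]

# LabelEncoder keeps its classes sorted, so code k corresponds to sorted_names[k].
# (Assumption: the dataset contains exactly these six activity strings.)
sorted_names = sorted(label_list)

mismatched = []
for code, (shown, actual) in enumerate(zip(label_list, sorted_names)):
    if shown != actual:
        mismatched.append(code)
    flag = "" if shown == actual else "  <-- mismatch"
    print(f"code {code}: plotted as {shown!r}, encoder order gives {actual!r}{flag}")

print("codes whose axis name differs from the encoder's name:", mismatched)
```

If this is the cause, the predictions themselves may be fine and only the axis names in the plot are shifted. Passing `class_names=label.classes_` instead of the hand-written list would keep the names aligned with the encoded labels.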
Hi, thanks for this tutorial. Could you add an example showing how to predict on a small batch of 80 x, y, z samples with the saved model, please? Thanks a lot, Guillaume
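A minimal sketch of what such a prediction could look like, under several assumptions: the model expects input of shape `(1, 80, 3, 1)` as in the script in this thread, the window is scaled with the same mean and scale used during training, and the class order is the sorted order produced by `LabelEncoder`. The helper names `prepare_window` and `predict_window` are hypothetical and not part of the tutorial. The frame layout deliberately copies `get_frames`, which stacks `[x, y, z]` and then reshapes, so the input matches what the model saw in training.

```python
import numpy as np

FRAME_SIZE = 80   # 4 s at 20 Hz, as in the script in this thread
N_FEATURES = 3    # x, y, z

# Assumption: class order is the sorted order that LabelEncoder produces.
CLASS_NAMES = ["Downstairs", "Jogging", "Sitting", "Standing", "Upstairs", "Walking"]

def prepare_window(x, y, z, mean, scale):
    """Turn 80 raw x, y, z samples into one input of shape (1, 80, 3, 1)."""
    raw = np.stack([x, y, z], axis=1).astype("float64")   # shape (80, 3)
    if raw.shape != (FRAME_SIZE, N_FEATURES):
        raise ValueError(f"expected {FRAME_SIZE} samples per axis, got {raw.shape}")
    # Same arithmetic as StandardScaler.transform with the training statistics.
    scaled = (raw - mean) / scale
    # get_frames() stores a frame as [x, y, z] (3 rows of 80 values) and then
    # reshapes it to (80, 3); reproduce that layout instead of a plain transpose.
    frame = scaled.T.reshape(FRAME_SIZE, N_FEATURES)
    return frame.reshape(1, FRAME_SIZE, N_FEATURES, 1)

def predict_window(model, window, class_names=CLASS_NAMES):
    """Return (predicted class name, class probabilities) for one window."""
    probs = np.asarray(model.predict(window))
    index = int(np.argmax(probs, axis=-1)[0])
    return class_names[index], probs[0]

# Possible usage with the objects from the script (not executed here):
#   model = keras.models.load_model('model.h5')
#   window = prepare_window(x80, y80, z80, scaler.mean_, scaler.scale_)
#   name, probs = predict_window(model, window)
```

Here `x80`, `y80` and `z80` stand for 80 consecutive raw readings per axis, and `scaler` should be the `StandardScaler` fitted on the training data. Fitting a new scaler on different data would shift the inputs relative to what the model was trained on.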