skhiearth / Coursera-IBM-Machine-Learning-with-Python-Final-Project

The following algorithms are used to build models for the different datasets: k-Nearest Neighbour, Decision Tree, Support Vector Machine, and Logistic Regression. The results are reported as the accuracy of each classifier, using the following metrics where applicable: Jaccard index, F1-score, and log loss. This project counts towards the final grade of the course.

Coursera-IBM-Machine-Learning-with-Python-Final-Project #1

Open swatisolanki2406 opened 4 years ago

swatisolanki2406 commented 4 years ago

[Screenshot (85)]

I'm facing an issue in the "Load Test set for evaluation" step. How can I solve it?

thebuffdude commented 4 years ago

I am facing the same issue.

excalibur768 commented 3 years ago

The error occurs because the Gender column in the test set is not encoded the way it was for the training data. Use the updated code below, which maps male/female to 0/1 and one-hot encodes the education column; it should resolve the issue.

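# Imports used by the snippet (test_df is assumed to be loaded already)
import pandas as pd
from sklearn import preprocessing

# Convert the date columns to datetime and derive the day of the week
test_df['due_date'] = pd.to_datetime(test_df['due_date'])
test_df['effective_date'] = pd.to_datetime(test_df['effective_date'])
test_df['dayofweek'] = test_df['effective_date'].dt.dayofweek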

# Evaluate the weekend field (dayofweek > 3, i.e. Friday to Sunday, counts as weekend)
test_df['weekend'] = test_df['dayofweek'].apply(lambda x: 1 if (x > 3) else 0)

# Encode Gender as 0/1, then build the feature table with one-hot encoded education level
test_df['Gender'] = test_df['Gender'].replace(to_replace=['male', 'female'], value=[0, 1])
test_feature = test_df[['Principal', 'terms', 'age', 'Gender', 'weekend']]
test_feature = pd.concat([test_feature, pd.get_dummies(test_df['education'])], axis=1)
test_feature.drop(['Master or Above'], axis=1, inplace=True)
test_feature.head()

# Normalize the test data
test_X = preprocessing.StandardScaler().fit(test_feature).transform(test_feature)
test_X[0:5]

# Target values
test_y = test_df['loan_status'].values
test_y[0:5]
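One way to keep the training and test preprocessing from drifting apart, which is what this thread's error comes down to, is to put the steps above into a single function and call it on both frames. The following is only a sketch under stated assumptions: `build_features` is a hypothetical helper name, the two small DataFrames are made-up toy rows (not the course data), and the education labels other than `Master or Above` are assumed. It also reuses the scaler fitted on the training features instead of fitting a new one on the test set, which is a swapped-in choice and differs from the snippet above; check which behaviour the course notebook expects.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: apply the same preprocessing steps to any loan frame."""
    df = df.copy()  # leave the caller's frame untouched
    df["effective_date"] = pd.to_datetime(df["effective_date"])
    df["dayofweek"] = df["effective_date"].dt.dayofweek
    # Friday (4) to Sunday (6) count as weekend, as in the snippet above
    df["weekend"] = (df["dayofweek"] > 3).astype(int)
    df["Gender"] = df["Gender"].map({"male": 0, "female": 1})
    features = df[["Principal", "terms", "age", "Gender", "weekend"]]
    dummies = pd.get_dummies(df["education"], dtype=int)
    features = pd.concat([features, dummies], axis=1)
    # errors="ignore" covers frames where this level never occurs
    return features.drop(columns=["Master or Above"], errors="ignore")


# Toy rows for illustration only (assumed values, not the real dataset)
train_df = pd.DataFrame({
    "Principal": [1000, 800, 1000, 300],
    "terms": [30, 15, 30, 7],
    "age": [45, 33, 27, 29],
    "Gender": ["male", "female", "female", "male"],
    "education": ["High School or Below", "Bechalor", "college", "Master or Above"],
    "effective_date": ["2016-09-08", "2016-09-09", "2016-09-10", "2016-09-11"],
})
test_df = pd.DataFrame({
    "Principal": [1000, 300],
    "terms": [30, 7],
    "age": [50, 35],
    "Gender": ["female", "male"],
    "education": ["Bechalor", "college"],
    "effective_date": ["2016-09-12", "2016-09-10"],
})

train_features = build_features(train_df)
# Align test columns with the training columns; missing dummy columns become 0
test_features = build_features(test_df).reindex(
    columns=train_features.columns, fill_value=0
)

# Assumption: scale the test set with statistics learned from the training set
scaler = StandardScaler().fit(train_features)
train_X = scaler.transform(train_features)
test_X = scaler.transform(test_features)
```

The `reindex` call guards against the case where an education level appears in one frame but not the other, which would otherwise leave the two feature matrices with different column sets.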