Open coderschoolreview opened 5 years ago
Missing question 16
For reference: Regression Metric
Answer
from sklearn.metrics import mean_absolute_error, mean_squared_error
mae = mean_absolute_error(y_test, y_predict)
mse = mean_squared_error(y_test, y_predict)
rmse = mse ** 0.5
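To see what the three metrics return, here is a small sketch on made-up numbers. The arrays below are hypothetical stand-ins for `y_test` and `y_predict`, not values from the assignment.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical values standing in for y_test and y_predict
y_test = np.array([3.0, 5.0, 2.0, 7.0])
y_predict = np.array([2.0, 5.0, 4.0, 8.0])

mae = mean_absolute_error(y_test, y_predict)  # mean of the absolute errors
mse = mean_squared_error(y_test, y_predict)   # mean of the squared errors
rmse = mse ** 0.5                             # square root puts the error back in the units of y

print(mae, mse, rmse)
```

Note that both functions are called with `(y_test, y_predict)`; writing `name = (y_test, y_predict)` would only assign a tuple.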
Good work, all answers are correct.
This didn't work because the folder "data" didn't exist. You could fix it by creating a folder named data in the same folder as the notebook.
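The folder could also be created from the notebook itself before anything is written to it. A minimal sketch, assuming the notebook runs from its own directory; `ensure_folder` is a hypothetical helper name, not part of the assignment.

```python
from pathlib import Path


def ensure_folder(path="data"):
    """Create the folder if it is missing; do nothing if it already exists."""
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    return folder


if __name__ == "__main__":
    # Creates ./data next to the notebook when run from the notebook's folder
    print(ensure_folder("data"))
```

Because of `exist_ok=True`, re-running the cell is safe.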
lmplot
The goal of this assignment was to introduce you to the following concepts in Machine Learning:
missingno to visualize missing data
pandas.get_dummies
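As a quick reminder of what `pandas.get_dummies` does, here is a small sketch. The DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: one categorical column and one numeric column
df = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1, 2, 3]})

# One-hot encode only the categorical column; numeric columns pass through unchanged
encoded = pd.get_dummies(df, columns=["color"])

print(encoded)
```

Each category becomes its own indicator column named `color_<value>`.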
Almost got it right in Modeling and Evaluation. One minor mistake when defining the function evaluate_model; you should have written it like this:
# Import numpy and the metrics used below
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix, precision_score, recall_score
# A utility function that takes a trained model as an argument and returns
# its recall and precision based on X and y
def evaluate_model(estimator, X, y, description):  # the `description` argument was missing
    prediction = estimator.predict(X)
    np.set_printoptions(precision=2)
    model_name = type(estimator).__name__
    return {'name': model_name,
            'recall': recall_score(y, prediction),
            'precision': precision_score(y, prediction),
            'description': description}
Installing the missingno package (or any other package) on Win10:
Open Anaconda Prompt and run:
conda install -c conda-forge missingno
Filtering series:
Your code: train_copy.isnull()
You could write it like this to get only the columns with null values:
ncols = train_copy.isnull().sum()
ncols[ncols!=0]
To check whether the whole DataFrame has any null values:
train.isnull().any().any()
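Both checks can be tried on a small DataFrame. The one below is made up and only stands in for `train_copy`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train_copy: two columns contain nulls, one does not
train_copy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0],
    "cabin": [None, None, "C85"],
    "fare": [7.25, 71.28, 8.05],
})

# Count nulls per column (note the parentheses on sum), then keep non-zero counts
ncols = train_copy.isnull().sum()
null_only = ncols[ncols != 0]

# First any() reduces each column to a bool, second any() reduces those to a single bool
has_null = train_copy.isnull().any().any()

print(null_only)
print(has_null)
```

Without the parentheses, `ncols` would be the bound method `sum` itself and the filter on the next line would fail.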
The evaluation function should print its results instead of returning them, so that when you loop through the list of models and evaluate them, the result of each iteration is printed to the output.
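A rough sketch of that printing style is below. The synthetic dataset, the two models, and the description strings are assumptions chosen only to make the example runnable; they are not the assignment's data or models.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate_model(estimator, X, y, description):
    """Print the confusion matrix and classification report for a fitted model."""
    prediction = estimator.predict(X)
    print(f"{type(estimator).__name__}: {description}")
    print(confusion_matrix(y, prediction))
    print(classification_report(y, prediction))


# Synthetic data standing in for the assignment's dataset (assumption)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Hypothetical list of models to compare
models = [
    (LogisticRegression(max_iter=1000), "baseline linear model"),
    (DecisionTreeClassifier(random_state=0), "single decision tree"),
]

for model, description in models:
    model.fit(X_train, y_train)
    evaluate_model(model, X_test, y_test, description)  # output appears once per iteration
```

With `return`, a notebook cell shows only the value of its last expression, so results produced inside a loop would not be displayed.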
The goal of this assignment was to introduce you to the following concepts:
You learned how to use PCA for dimensionality reduction, KMeans, and Hierarchical Clustering. You also learned to visualize the results of both techniques.
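For reference, the three pieces fit together roughly as sketched below. The data comes from `make_blobs` and the choice of 3 clusters is an assumption for illustration, not the assignment's dataset or settings.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic 6-dimensional data with 3 groups (assumption, for illustration only)
X, _ = make_blobs(n_samples=150, centers=3, n_features=6, random_state=0)

# Reduce to 2 principal components so the clusters can be drawn on a plane
X_2d = PCA(n_components=2).fit_transform(X)

# Cluster the reduced data with both techniques
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)
hier_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X_2d)

print(X_2d.shape, set(kmeans_labels), set(hier_labels))
```

To visualize either result, a scatter plot of the two columns of `X_2d` colored by the label array is usually enough.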
Could try this shorter version:
total_purchases = data.sum(axis=1)
purchase_percent = data.div(total_purchases, axis=0) * 100
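Here is the same two-line version on a tiny table, to show what `axis=1` and `axis=0` do. The column names and numbers are hypothetical.

```python
import pandas as pd

# Hypothetical spending table: one row per customer, one column per category
data = pd.DataFrame(
    {"Fresh": [10, 30], "Milk": [30, 30], "Grocery": [60, 40]},
    index=["customer_a", "customer_b"],
)

# axis=1 sums across the columns, giving one total per row
total_purchases = data.sum(axis=1)

# axis=0 aligns the totals with the row index, so each row is divided by its own total
purchase_percent = data.div(total_purchases, axis=0) * 100

print(purchase_percent)
```

Each row of `purchase_percent` sums to 100.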
Could write it in this style for cleaner, easier-to-read code:
Overall, most answers are correct; 2 questions are missing: