Open coderschoolreview opened 5 years ago
The goal of this assignment is for you to learn to build and evaluate a model using the scikit-learn library in Python. You learn how to pre-process data, split it into training and test sets, train your model on the training set, and evaluate its performance on the test set.
Things you did well:
Discovering the usage of many new functions!
Understanding how to use data to train a model using a library in Python. This is a very important step!
Some minor tips:
Question 3: You should set annot=True when using heatmap so that the correlation coefficients are displayed: sns.heatmap(data.corr(), annot=True)
You should include more features in your X (before splitting it), instead of just the single feature "Avg. Session Length".
Question 14: You must save your predictions to y_prediction: y_prediction = lm.predict(X_test), so that you can use them later, for example in a scatter plot: plt.scatter(y_test, y_prediction).
The Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) can be computed as follows: metrics.mean_absolute_error(y_test, y_prediction), metrics.mean_squared_error(y_test, y_prediction), np.sqrt(metrics.mean_squared_error(y_test, y_prediction))
In the "Residual" section, you compute the difference between y_test and y_prediction, and then plot it to see whether it follows a normal distribution. So: residuals = y_test - y_prediction, and then plot it using: sns.distplot(residuals, bins=10). Note that distplot is deprecated since seaborn 0.11; sns.histplot(residuals, bins=10, kde=True) is the replacement.
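The tips above on prediction, error metrics, and residuals can be sketched end to end as follows. This is only an illustration under stated assumptions: the data is synthetic (three made-up numeric features standing in for the assignment's columns), and the split ratio and random seeds are arbitrary choices, not values from the assignment. Plotting is left out so the sketch runs without a display.

```python
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the assignment data (assumption): 3 features, linear target plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Split first, then fit on the training portion only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

lm = LinearRegression()
lm.fit(X_train, y_train)

# Save the predictions so they can be reused for metrics and plots.
y_prediction = lm.predict(X_test)

mae = metrics.mean_absolute_error(y_test, y_prediction)
mse = metrics.mean_squared_error(y_test, y_prediction)
rmse = np.sqrt(mse)

# Residuals: difference between the true and predicted targets.
residuals = y_test - y_prediction

print(f"MAE={mae:.4f}  MSE={mse:.4f}  RMSE={rmse:.4f}")
```

With real data, residuals can then be passed to a histogram plot to check whether they look roughly normally distributed around zero.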
Hi Hiep Pham,
It seems like you are getting familiar with sentiment analysis, but you still have trouble finishing your assignment.
The cause of that issue is in your "def preprocessor(text):" function: you name the emoticon collection "emotions" but refer to it as "emoticons" later in this code:
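The student's original code is not reproduced here. As a hedged illustration of the fix, here is a minimal sketch of a regex-based preprocessor in which the variable is named emoticons consistently. The regular expressions and the cleaning steps are assumptions chosen for the example, not necessarily what the assignment used.

```python
import re

def preprocessor(text):
    # Remove HTML tags such as <br /> or <b>...</b>.
    text = re.sub(r'<[^>]*>', '', text)
    # Collect emoticons like :) ;-( =D before punctuation is stripped.
    # The same name, `emoticons`, is used here and below.
    emoticons = re.findall(r'(?::|;|=)(?:-)?(?:\)|\(|D|P)', text)
    # Lowercase and collapse every run of non-word characters into one space.
    cleaned = re.sub(r'\W+', ' ', text.lower()).strip()
    # Append the emoticons (with the "nose" character removed) at the end.
    return (cleaned + ' ' + ' '.join(emoticons).replace('-', '')).strip()

print(preprocessor("<b>Great</b> movie :-)"))
```

If the collection were assigned to emotions but read back as emoticons, Python would raise a NameError at the line that uses the undefined name.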
The goal of this assignment was to introduce you to the following concepts:
You learn how to use PCA for dimensionality reduction, KMeans, and Hierarchical Clustering. You also learn to visualize the results of both clustering techniques.
You're stuck on this question because you initialized PCA with 2 components but then used fit_transform on X as if it produced 6 components (V1, ..., V6) and (PC1, ..., PC6).
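The shape mismatch described above can be seen in a small sketch. It assumes a made-up data set with six columns named V1 to V6; the random data and the variable names (scores, loadings, pc_labels) are hypothetical and only serve to show which dimensions PCA returns.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical data: 100 rows, 6 original features V1..V6.
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(100, 6)),
    columns=[f"V{i}" for i in range(1, 7)],
)

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)  # shape (100, 2): one column per component

# Build the labels from the fitted number of components, not from the feature count.
pc_labels = [f"PC{i}" for i in range(1, pca.n_components_ + 1)]

# Projected data: rows are samples, columns are PC1 and PC2.
scores = pd.DataFrame(X_reduced, columns=pc_labels)

# Loadings: rows are components (2), columns are the original features (6).
loadings = pd.DataFrame(pca.components_, index=pc_labels, columns=X.columns)

print(scores.shape, loadings.shape)
```

Labeling the output with PC1 to PC6 only works if PCA is created with n_components=6; with n_components=2 there are exactly two principal components to name.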
Goal of this Assignment
The goal of this assignment was to introduce you to 2 main concepts in Machine Learning:
You learn how to query and clean data using the pandas library in Python, and how to make plots with the Seaborn library that help you understand the data better.
Things you did well:
Things to work on:
ecom['Language']=='en' already returns a Series of True or False values. The next step is to count how many True records there are by using sum(): sum(ecom['Language']=='en'). To sum up:
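The counting step described here can be sketched as follows, assuming a tiny made-up ecom DataFrame (the five rows are invented for illustration). It shows both the built-in sum() mentioned in the review and the equivalent Series.sum() method.

```python
import pandas as pd

# Hypothetical stand-in for the assignment's ecom DataFrame.
ecom = pd.DataFrame({"Language": ["en", "fr", "en", "de", "en"]})

# The comparison alone yields a boolean Series, one True/False per row.
is_english = ecom["Language"] == "en"

# True counts as 1 and False as 0, so summing counts the matching rows.
count_builtin = sum(is_english)
count_method = is_english.sum()

print(count_builtin, count_method)
```

The Series.sum() form is usually preferred on large data because it is vectorized, but both give the same count.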