Open wxindu opened 5 years ago
Lab #2 Feedback
✔️ Mostly fine work. Think again about the last simulation problem (Exercises 9-12).
Exercise 1 ~ 6 Perfect
Exercise 7 Be careful, question 2 should be inference. Question 4 should be data description.
Exercise 9-12 A few things: with rnorm(), the mean is set to 0 and the sd to 1. Is it reasonable to assume that mag is centered around 0? Try a different normal distribution.
Lab #4 Feedback
✔️ ➕ Good job overall!
Lab part: Looks good
Problem Set:
Exercise 2 a) Good b) Good
Exercise 3
a) Yes, the training RSS will steadily decrease as s increases until the LS estimates are reached; from there, the training RSS will remain unchanged with any further increase in s.
b) Yes, the testing RSS will decrease as s increases until the optimal complexity is reached, and from there it will increase again. Once the LS estimates are reached, the testing RSS will remain unchanged.
c) Good. The variance will increase steadily until the LS estimates are reached, at which point it will remain unchanged.
d) Good. The bias will decrease steadily until the LS estimates are reached, after which it will remain unchanged.
e) Good.
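If it helps to see the training RSS behavior in a) numerically, here is a minimal sketch. It is not the exercise's own setup: it uses ridge regression in penalty form as a stand-in because that has a closed form in base R, and a larger budget s corresponds to a smaller lambda. The data, the coefficients, and the names `ridge_fit` and `res` are all made up for illustration.

```r
set.seed(1)

# Simulated, standardized predictors and a centered response (all hypothetical)
n <- 100
p <- 5
X <- scale(matrix(rnorm(n * p), n, p))
y <- as.vector(X %*% c(3, -2, 0, 0, 1) + rnorm(n))
y <- y - mean(y)

# lambda = 0 gives the LS estimates; larger lambda means a tighter budget s
lambdas <- c(0, 1, 10, 100, 1000)

ridge_fit <- function(lambda) {
  # Closed-form ridge solution: (X'X + lambda * I)^{-1} X'y
  beta <- solve(crossprod(X) + lambda * diag(p), crossprod(X, y))
  c(lambda = lambda,
    l2_norm = sqrt(sum(beta^2)),           # size of the coefficients (the "budget" used)
    train_rss = sum((y - X %*% beta)^2))   # training RSS at this lambda
}

res <- t(sapply(lambdas, ridge_fit))
res
```

Reading the table from the bottom row up (lambda shrinking, so the budget growing), the training RSS falls until it reaches the LS value at lambda = 0.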
Exercise 4
a) Correct.
b) Good
c) Good
d) Good
e) Good
Exercise 5 NA
Exercise 6 good job
Lab #5 Feedback
✔️ Review non-linear regression models.
Problem 1
Please review non-linear regression models and how to include quadratic terms in a model: make sure you always include the original variable, not only its square. You need + exports + I(exports^2).
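A minimal sketch of what that formula looks like in lm(), assuming a made-up data frame: the response `growth`, the simulated values, and the coefficients are hypothetical, only `exports` and `schooling` come from the lab.

```r
set.seed(1)

# Hypothetical data: only the variable names exports and schooling come from the lab
d <- data.frame(
  exports   = runif(100, 0, 50),
  schooling = runif(100, 5, 15)
)
d$growth <- 1 + 0.2 * d$schooling + 0.3 * d$exports -
  0.005 * d$exports^2 + rnorm(100, sd = 0.5)

# Include the original variable AND its square; I() protects ^ inside a formula
fit <- lm(growth ~ schooling + exports + I(exports^2), data = d)
coef(fit)
```

Without I(), the ^ operator is interpreted as formula syntax rather than arithmetic, so exports^2 would not add a squared term.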
Problem 2
Be careful: you would want to change schooling back to its original value before changing exports. Otherwise your steps look right. You can try using filter() and mutate() to subset and modify your dataset. Try to fix the model in problem 1 and redo this problem, and see if the results make better sense.
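One way to keep the two changes separate is to build each scenario from the same untouched baseline row. This is only a sketch under assumptions: the data, the `country` column, the response `growth`, and the sizes of the changes are invented for illustration.

```r
library(dplyr)

set.seed(2)

# Hypothetical data; only exports and schooling are names from the lab
d <- data.frame(
  country   = paste0("c", 1:50),
  schooling = runif(50, 5, 15),
  exports   = runif(50, 0, 50)
)
d$growth <- 1 + 0.2 * d$schooling + 0.3 * d$exports -
  0.005 * d$exports^2 + rnorm(50, sd = 0.5)

fit <- lm(growth ~ schooling + exports + I(exports^2), data = d)

# Pick one observation as the baseline
baseline <- d %>% filter(country == "c1")

# Each scenario starts from the baseline, so only one variable changes at a time
more_schooling <- baseline %>% mutate(schooling = schooling + 1)
more_exports   <- baseline %>% mutate(exports = exports + 10)

predict(fit, newdata = baseline)
predict(fit, newdata = more_schooling)
predict(fit, newdata = more_exports)
```

Because `more_exports` is derived from `baseline` and not from `more_schooling`, schooling is automatically back at its original value.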
2.3
Good. Also consider the response variable of a logistic regression model. As the values of the predictors change, the odds of the response change exponentially.
Problem 3 Your steps look right. Redo the model in problem 1 and try again.
Problem 4 You have the same problem in these models as in problem 1. Always include both the original variable and its quadratic form in your model.
Problem Set
Problem 4 a) Good b) Good c) Good d) Good e) Good
Problem 6 a) Good b) Good
Problem 7 Good job
Lab #6 Feedback
✔️ ➖ Review bootstrapping. Review confidence intervals. Be careful with your code: whenever you run into errors and cannot figure out why they happen, please ask for help.
Inventing a variable
partition_index by doing the following code:
d <- d %>% mutate(fold = sample(1:5, nrow(d), replace = TRUE))
Collaborators? Inverting a variable
A simple model
I don't understand your work here. Read the question again. Here the estimate of the return is simply 1/MAPE.
Is simple sufficient? Your code for the bootstrap looks right. Think again about what the parameter of interest is here. Try not to use other packages to help you find confidence intervals; construct them yourself. Review how to construct a confidence interval using bootstrapping. What's the margin of error? Your code does not work here, which is one of the reasons why the file cannot knit.
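For reference, here is a minimal base R sketch of building a bootstrap confidence interval by hand. It assumes the parameter of interest is a mean and uses a simulated sample, neither of which comes from the lab; swap in your own statistic and data.

```r
set.seed(42)

# Hypothetical sample; replace with the data for your parameter of interest
x <- rexp(200, rate = 1 / 5)

# Resample with replacement B times and recompute the statistic each time
B <- 2000
boot_means <- replicate(B, mean(sample(x, size = length(x), replace = TRUE)))

# Option 1: percentile interval, the middle 95% of the bootstrap distribution
ci_percentile <- quantile(boot_means, probs = c(0.025, 0.975))

# Option 2: estimate +/- margin of error, using the bootstrap standard error
se_boot <- sd(boot_means)
margin  <- qnorm(0.975) * se_boot
ci_se   <- mean(x) + c(-1, 1) * margin

ci_percentile
ci_se
```

The margin of error in the second interval is the critical value times the bootstrap standard error, which is the quantity the question above is asking about.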
One big happy plot
You are missing a + sign in your code here, so your code returns an error. Fix that and try again.
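As a reminder of where the + goes, here is a small sketch, assuming the plot is built with ggplot2; the dataset and aesthetics are placeholders and not your plot.

```r
library(ggplot2)

# Each layer is chained with + at the END of the line, so R knows
# the expression continues on the next line
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight", y = "Miles per gallon")

p
```

If a + is missing, R treats the plot as finished at that line and then tries to evaluate the next layer by itself, which is usually where the error or the missing layer comes from.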
The big picture Missing
Exercise Missing
Lab #7 Feedback
✔️ Good
Problem 1
Problem 2 Good
Problem 3 Good job
Problem 4 Good.
Problem 5 Good. Try not to print out all data.
Problem 6
You can specify n.var = in your varImpPlot() to display only the first several results on the plot.
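A short sketch of that argument, assuming the randomForest package and using mtcars as a placeholder dataset; the model here is not the one from your lab.

```r
library(randomForest)

set.seed(3)

# Placeholder model: predict mpg from the other mtcars columns
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

# Show only the 5 most important variables instead of all of them
varImpPlot(rf, n.var = 5)
```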
Lab #8 Feedback
✔️ ➕ Very good work
Building a boosted tree Good job
Assessing predictions
Slow the learning Very nice work.
Communities and Crime Good. One little thing: Andrew should have provided both the training set and the testing set. The dataset you are using here was supposed to be the training set. Be careful. Your steps for subsetting the training set and testing set look correct, though.
Chapter 8 Exercises
Lab #1 Feedback
✔️ Good work overall
Exercise 1 Good
Exercise 2 Good. I might organize your code slightly differently than you did here. Let me show you: As you may see in your pdf file, your labs() can be a bit too long, and it may still be at risk of being cut off. Try to organize your code like I did here and see if you like it.
Exercise 3 Good
Exercise 4 Good. You can also make a histogram of crim to see visually whether there are extreme values.
Exercise 5 Good
Exercise 6 Good
Exercise 7 How would the model "apply the correlations to predict the average value of a home"? Please describe in more detail how this can be done. Also, how would you decide which variables to include in your model? How would the correlation between different predictors affect your decision on which ones to include?