Closed hfboyce closed 3 years ago
This is @kvarada's feedback, coming from the other repo found here.
Hi @hfboyce, I see that you have put a lot of work into this module!
Here are my comments for improvement:
- scikit-learn: `nn.kneighbors([[-80, 25]])` --> Nice!
- `euclidean_distances` function: what would happen if we didn't use `fill_diagonal()`?
- `Snoodle` shouldn't have the target value in it, so it should be: `[[53, 77, 43, 69, 80, 57, 379]]`
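To illustrate the `fill_diagonal()` point, here is a minimal sketch (the feature matrix is made up, not the module's data): without overwriting the zero diagonal, each example's nearest "neighbour" would be itself.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

# toy feature matrix; rows are examples (made-up values)
X = np.array([[53, 77, 43],
              [50, 80, 40],
              [10, 20, 30]])

dists = euclidean_distances(X)   # pairwise distances; the diagonal is all zeros
np.fill_diagonal(dists, np.inf)  # without this, argmin would pick each row itself
nearest = dists.argmin(axis=1)   # index of the closest *other* example
print(nearest)                   # -> [1 0 0]
```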
- `print("The training score is %.4f" % (train_score))`
- Up to which value of `n_neighbors` is there overfitting?
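For the `n_neighbors` question, a minimal sketch of how one might probe where the overfitting stops (synthetic data, not the module's dataset): with `n_neighbors=1` the training score is perfect, and the train/validation gap shrinks as `k` grows.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# synthetic stand-in data (the module's real dataset is not shown here)
X, y = make_regression(n_samples=200, n_features=2, noise=10, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

for k in [1, 3, 5, 11, 21]:
    model = KNeighborsRegressor(n_neighbors=k).fit(X_train, y_train)
    train_score = model.score(X_train, y_train)
    valid_score = model.score(X_valid, y_valid)
    print("k=%2d  train %.4f  valid %.4f" % (k, train_score, valid_score))
```

The `k` at which the train and validation scores converge is where the overfitting question could point students.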
- The answer feedback could read: "Incorrect. The points (2, 2), (5, 2) and (4, 3) are the closest to (0, 0), so we must take the average of all of their values." and "You got it! We must take the average of the 3 nearest examples."
If you want to use k=3, the average would be 0.333, which is not in the options.

Huh! Long module! Thanks for putting in all this work.
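As a quick sanity check on the k=3 averaging above, a sketch using the three points from the exercise; the 0/1 target values here are hypothetical, chosen only so the mean works out to 0.333 (they are not from the module).

```python
import numpy as np

query = np.array([0, 0])
# the three closest points named in the quiz feedback
neighbours = np.array([[2, 2], [5, 2], [4, 3]])
# hypothetical targets (assumption, not the module's data)
targets = np.array([0, 0, 1])

dists = np.sqrt(((neighbours - query) ** 2).sum(axis=1))
print(dists)                 # roughly [2.83, 5.39, 5.0]
prediction = targets.mean()
print(round(prediction, 3))  # 0.333
```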
Will look at the assignment either tomorrow or Thursday if that's OK.
4.8 What are `x_1`, `x_0`, `y_1`, and `y_0`? It might be helpful to define them clearly for the students.
4.8 I like that you are making this connection, but I worry that it might take a while to actually explain it, and I'm not sure it's necessary.
Should I keep in or remove?
7 Calculating Euclidean distances "by hand" --> I would probably call it "step by step" instead of "by hand", because they are actually coding it up. In general this exercise is fine, but I'm not sure we need it after the first few exercises.
Do you think it's ok if I keep it?
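For reference, a small sketch of what the "step by step" version might look like next to the scikit-learn one (the feature vectors are made up; note neither includes the target value, per the earlier comment):

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

u = np.array([53, 77, 43, 69, 80, 57])  # made-up feature vectors
v = np.array([30, 60, 20, 70, 90, 50])

# step by step: subtract, square, sum, square root
by_hand = np.sqrt(np.sum((u - v) ** 2))

# the same distance via scikit-learn
by_sklearn = euclidean_distances(u.reshape(1, -1), v.reshape(1, -1))[0, 0]

print(by_hand, by_sklearn)  # the two values agree
```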
8 Do we show them anywhere how to calculate Euclidean distance using scikit-learn?
Yes! In deck 4 slide 7.
11.2 "Finding the distances to a query point takes double the time as finding the nearest neighbour." I'm not sure I understand this question.
I've changed it to: "Calculating the distances between an example and a query point takes twice as long as calculating the distances between two examples."
13.2 Minor suggestion: how about using a different colour than gray? Something that stands out?
These are images I stole from Mike, so unfortunately I don't have the source code to change them.
- Is it possible to get slightly better plots here? I guess you want to show that SVM decision boundaries are smoother, right?
I'll add this to the wish list?
- I think the bad results are due to scaling; both k-NN and SVM RBF should suffer because of that.
Should I make any changes to it?
@hfboyce
Here she is! This was a bit more time-consuming for me, but hopefully it's OK.
https://intro-machine-learning.netlify.app/en/module4
Please be as thorough as you like. Missing notes in the transcript are intentional, so we can fill them in with what you end up talking about.
I'll be working on the assignment all day tomorrow, so it should be on its way to you before Monday, which is when I'll be starting Module 5 (hopefully on track).
(Also, no need to rush; Elijah is still working on Assignment 3.)

This module should have 28 exercises.