UBC-MDS / introduction-machine-learning


Module 2 Feedback #19

Closed · hfboyce closed this issue 3 years ago

hfboyce commented 3 years ago

Please see the deployed link here. @kvarada it's important that we are both seeing the same version, as sometimes your browser will not update and it can be very frustrating. Make sure to clear the cookies for this site from now on. This module should have 21 "exercises" (this is what we call each of these numbered sections).

The latest change was in Exercise 21: the learning outcomes should NOT include the bullet "Split a dataset into train and test sets using the train_test_split() function."

Please feel free to point out everything you can so we can improve this and produce the best product. Thanks :)

mgelbart commented 3 years ago

@kvarada I'll leave this to you, but let me know if there's anything you want me to look at.

kvarada commented 3 years ago

Sounds good. I'll get to this in the afternoon.

kvarada commented 3 years ago

Great job with the module, @hfboyce! I see a bit of the "Hayley touch" :).

Here are my comments for improvement for Module 2.

[Collapsed per-exercise comments for Exercises 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 18, and 20; the comment bodies were not preserved.]

Great work overall!

kvarada commented 3 years ago

Btw, it took me ~2 hours. Probably because it was my first time (and I am also a bit slow in general :)).

mgelbart commented 3 years ago

> Btw, it took me ~2 hours. Probably because it was my first time (and I am also a bit slow in general :)).

Yeah, I wasn't tracking my hours back then, but I think it probably took me ~2 hours as well at the beginning; I got faster after that.

> I suggest changing both questions. Although sklearn doesn't support categorical variables in decision trees, decision trees are capable of handling them. For example, you could have a categorical variable "weather" with three possible values (e.g., warm, hot, and cold); a node would then have three children and represent an if/elif/else statement. What do you think @mgelbart?

My take is that you'd always have 2 children, but you'd have split rules like "weather == warm?" rather than having to one-hot encode weather, which is a limitation of scikit-learn. I guess there's this new CatBoost, which handles categorical variables directly with some sort of gradient-boosted trees. So personally I'm OK with the second question here. For the first question, it's a bit unclear to me whether "children" means direct children or all descendants. I agree with @kvarada on removing the first one, and I'm fine with either keeping or deleting the second one.
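To make the scikit-learn limitation concrete, here is a minimal sketch (not from the course materials; the "weather"/"played" data and column names are invented for illustration) showing that DecisionTreeClassifier only splits on one-hot encoded columns:

```python
# Minimal sketch of the scikit-learn behaviour discussed above.
# The data here is made up purely for illustration.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    "weather": ["warm", "hot", "cold", "warm", "hot", "cold"],
    "played":  [1, 1, 0, 1, 0, 0],
})

# scikit-learn trees cannot split on string categories directly,
# so the categorical column is one-hot encoded first.
X = pd.get_dummies(df[["weather"]])  # weather_cold, weather_hot, weather_warm
y = df["played"]

tree = DecisionTreeClassifier(random_state=123).fit(X, y)

# Every split is binary, e.g. "weather_warm <= 0.5", which is just the
# "weather == warm?" rule phrased over a one-hot column.
print(tree.tree_.node_count)
```

(By contrast, if memory serves, CatBoost accepts categorical columns directly via its cat_features argument, so no one-hot encoding step is needed there.)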

kvarada commented 3 years ago

Oh, I see what you mean. I was thinking about pictures such as the one below, which are not uncommon as a way of representing decision trees.

[Screenshot: a decision tree drawn with a multi-way split on a categorical variable (e.g., Sunny / Overcast branches); source link not preserved.]

mgelbart commented 3 years ago

Yeah, makes sense. I guess under the hood they would probably be two separate splits: one for == Sunny and then, on the False branch, another for == Overcast. But I agree it's getting into technicalities which are beside the point of what we're teaching here.
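For what it's worth, here is a hypothetical sketch of that equivalence (the labels mimic the classic play-tennis example, since the actual screenshot wasn't preserved): a three-way split drawn as one node versus the two nested binary splits a binary tree would use under the hood.

```python
# Hypothetical illustration only; not library code.

def multiway_split(outlook):
    # One node with three children, as drawn in diagrams like the screenshot.
    if outlook == "Sunny":
        return "check humidity"
    elif outlook == "Overcast":
        return "play"
    else:
        return "check wind"

def binary_splits(outlook):
    # The same logic as two binary splits: "== Sunny?", then, on the
    # False branch, "== Overcast?".
    if outlook == "Sunny":
        return "check humidity"
    if outlook == "Overcast":
        return "play"
    return "check wind"
```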

hfboyce commented 3 years ago

So take both out, or just the first one? I can replace them with something else.

mgelbart commented 3 years ago

I think the second one is keepable if we remove the second option (if/elif/else), to reduce confusion.

kvarada commented 3 years ago

@hfboyce I used Module 2 exercises in my class activities today. Seems like the students enjoyed them :tada:. A couple of things came up:

hfboyce commented 3 years ago

@kvarada

> Many of them were not able to run the code exercises; they were getting a BinderHub error.

This may have something to do with their browser, their network, or where they are located geographically. Do you know what browser they were using or where they are located?

> The autograding for 7.3 is probably wrong?

Yes, Elijah has yet to write the tests for Module 2, so this autograding is 100% wrong.

> In Exercise 11 we are showing them two trees: one with a class in each node and the other without. It was confusing for the students.

I've updated this! Thank you!