hfboyce opened this issue 3 years ago
- 4: have they already seen `make_scorer`?
- `model.intercept_`. `Ridge` is not a classifier.
- `lr.coef_[0]` instead of `lr.coef_`.
- `lr.classes_` so we can see the order, i.e. that Canada is negative and US is positive in the model's brain 🧠
- `CountVectorizer` was used and so each column is a word, and since each coef is a column we now have one coef per word? And also the fact that the targets are positive or negative sentiment here. Maybe that would be worth adding as a last slide on the previous slide deck so there could be video footage introducing this? Or is text enough in 10? I was also caught a bit off guard by the text itself. Maybe you could say "We have the following text, which we wish to classify as either positive or negative." When this is done let's touch base again so I can check that the context is sufficient.
- 15: have we defined "positive class" explicitly? Maybe this can be mentioned when we add in the `classes_` I mentioned earlier.

Will review 16-20 soon.
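A minimal sketch of the `intercept_` / `coef_[0]` / `classes_` point above, on a made-up two-class mini-corpus rather than the course data:

```python
# Made-up mini-corpus, just to show the shapes and the class order.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

X_text = ["maple syrup and hockey", "baseball and thanksgiving",
          "tim hortons and maple syrup", "baseball and apple pie"]
y = ["Canada", "US", "Canada", "US"]

vec = CountVectorizer()
X = vec.fit_transform(X_text)            # one column per word
lr = LogisticRegression(max_iter=1000).fit(X, y)

print(lr.classes_)      # ['Canada' 'US'] -> US is the positive class, Canada the negative
print(lr.intercept_)    # shape (1,) for a binary problem
print(lr.coef_.shape)   # (1, n_words), so lr.coef_[0] gives one coefficient per word
print(dict(zip(vec.get_feature_names_out(), lr.coef_[0])))
```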
- `pkm_grid` even though it's random search?
- `max_iter=1000` in the log reg. May be worth having them do this.
- `draft_round` will increase the probability of Guard, but it may or may not increase the probability of Forward!! Because in fact if `draft_round` is huge then the probability will go to 99.9999 for Guard and the probability for Forward will be decreasing. So we have to be quite careful here with our interpretations. I will also add this to my CPSC 330 course. So, TLDR here, this wording should hopefully be safe: "For which feature does increasing its value always push the prediction away from the Other class?"

> 1.5: it would have been nice to also show an example with lots of features where alpha=0 is not the best choice - maybe one of the previous regression datasets they've seen already?
Ok, this was a problem. I tried a bunch of different datasets using a single feature, but the problem I was encountering was that it did not plot very nicely. So I guess my question here is what is more important: a plot that is easy to understand, or a best alpha that isn't the lowest value? Of this hyperparameter tuning and the logistic regression one, it's easier to change this one (since we use the logistic regression dataset for quite a few slide decks), but would you be ok sacrificing the plot for it?
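Going back to the `draft_round` comment above: a tiny numeric sketch of the softmax behaviour, with completely made-up per-class coefficients (not the course model). Guard's probability climbs monotonically, while Forward's first rises and then falls.

```python
import numpy as np

# Hypothetical per-class weights on draft_round (not taken from the slides).
coefs = {"Guard": 2.0, "Forward": 1.0, "Other": -4.0}

def softmax_probs(draft_round):
    scores = np.array(list(coefs.values())) * draft_round
    exp = np.exp(scores - scores.max())          # numerically stable softmax
    return dict(zip(coefs, exp / exp.sum()))

for x in [0.0, 0.5, 1.0, 2.0, 5.0]:
    print(x, {k: round(float(v), 3) for k, v in softmax_probs(x).items()})
# Forward goes roughly 0.333 -> 0.37 -> back down toward 0,
# even though its own coefficient is positive; Guard heads toward 1.
```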
> 4: have they already seen `make_scorer`?
Yep! I talk about it in Module 7!
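For reference, a generic `make_scorer` sketch (not the Module 7 example), just to show the pattern of wrapping a metric for `cross_validate`:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer, mean_absolute_error
from sklearn.model_selection import cross_validate

X, y = make_regression(n_samples=200, n_features=10, noise=10, random_state=0)

# Wrap MAE so cross_validate can use it; scores come back negated
# because greater_is_better=False.
mae_scorer = make_scorer(mean_absolute_error, greater_is_better=False)
scores = cross_validate(Ridge(), X, y, scoring=mae_scorer, cv=5)
print(scores["test_score"])
```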
> 5.8: I'm hesitant about this because the features are not scaled. Remove this slide? UPDATE: this shows up in 12.1. Ok so maybe we need to keep this in but add a cautionary note that it depends on the scaling of the features, because larger features will have smaller coefficients, but if we scale then they are kind of on a level playing field.
I added this in the transcript notes of the same slide, is that ok?
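One way to make that cautionary note concrete, if it helps, is a sketch like this (synthetic data, not the slide's dataset): put `StandardScaler` in front of the model so the coefficients end up on a level playing field.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=300, n_features=5, noise=5, random_state=42)
X[:, 0] *= 1000   # blow up one feature's scale

# Unscaled: the inflated feature gets a correspondingly tiny coefficient.
print(Ridge().fit(X, y).coef_)

# Scaled: coefficient magnitudes become comparable across features.
pipe = make_pipeline(StandardScaler(), Ridge()).fit(X, y)
print(pipe.named_steps["ridge"].coef_)
```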
> 5.9: let's just call them weights.
Wait wait, I thought we were calling them coefficients?
> any chance someone will get confused and think 9.172344e+06 is not the same as 9172344.01129167?
We explained this in the first module of PPDS (a practice problem)
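If it's worth a one-liner on a slide, something like this shows it's the same number, just printed in scientific notation with fewer digits:

```python
x = 9172344.01129167
print(f"{x:.6e}")   # 9.172344e+06  (scientific notation, rounded)
print(f"{x:.8f}")   # 9172344.01129167
```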
> 9.9: again, kind of sad that the lowest regularization model does best. Let's see if we can get a good example for at least one of the two cases (Ridge or LogisticRegression) where it isn't the case?
See 1.5 above. This was also a bit of a mess. I found it hard to find a good way of showing the values actually changing somewhat reasonably.
> 15: have we defined "positive class" explicitly? Maybe this can be mentioned when we add in the `classes_` I mentioned earlier.
We did heavily in Module 7!
> 17.9: x-axis label cut off
But I have so much room at the bottom of mine! How big is your screen? This is mine at 100%.
Ok, the majority of the changes are done. I just need to figure out Question 10 and add a slide in deck 9; I will do that in the morning. In the meantime, I can pass the exercises to Elijah to make tests for, so I can review them before Friday.
> Ok, this was a problem. I tried a bunch of different datasets using a single feature, but the problem I was encountering was that it did not plot very nicely. So I guess my question here is what is more important: a plot that is easy to understand, or a best alpha that isn't the lowest value? Of this hyperparameter tuning and the logistic regression one, it's easier to change this one (since we use the logistic regression dataset for quite a few slide decks), but would you be ok sacrificing the plot for it?
Yeah, I don't think a single feature would work - basically any dataset with a single feature will probably prefer alpha=0 as the best. Maybe here we should consider deviating from our rule of fewer datasets and have separate datasets for the plot vs. the alpha tuning?
Yeah, it's fine to do it here and leave the logistic regression alone.
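A rough sketch of what that separate alpha-tuning dataset could look like (synthetic data via `make_regression`, purely illustrative): with many noisy features and relatively few rows, the best `alpha` usually isn't the smallest one.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# More features than samples, most of them uninformative.
X, y = make_regression(n_samples=100, n_features=200, n_informative=10,
                       noise=20, random_state=0)

search = GridSearchCV(Ridge(), {"alpha": np.logspace(-3, 3, 13)}, cv=5).fit(X, y)
print(search.best_params_)   # typically a mid-range alpha rather than the smallest
```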
> I added this in the transcript notes of the same slide, is that ok?
Yeah sounds good.
> Wait wait, I thought we were calling them coefficients?
I have no idea what I was talking about. Please disregard.
> But I have so much room at the bottom of mine! How big is your screen? This is mine at 100%.
I think it's that I have a very high-resolution screen. Anyway, this is what I see. There's a lot of whitespace at the top above the picture that could be removed. Sometimes with matplotlib, calling `plt.tight_layout()` before saving helps with this - not sure if that would help here.
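For what it's worth, the usual fix looks something like this (generic figure, not the slide's plot):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [4, 5, 6])
ax.set_xlabel("a fairly long x-axis label that might otherwise get clipped")

fig.tight_layout()                             # trims extra margins
fig.savefig("plot.png", bbox_inches="tight")   # also keeps labels from being cut off
```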
@mgelbart Here we go!
First round coming in HOT 🔥! As I said, this is one of my least happy modules and it feels a little choppy to me.
I am ready for some feedback to fix it up.
You can find it here -> https://ml-learn.mds.ubc.ca/en/module8
Assignment coming tomorrow/Monday.