Homework and Tutorial 8

Adding in the old (unfinished) version of homework 8 for the sake of opening a new PR. Will be adding all the files when updates are made to them.

Some concerns/comments:

Tutorial 8 will need a tutorial assignment question, since it was previously the presentation on ethics but that has been moved to previous weeks (am happy to make a draft if you'd like; it seems pretty fun).
Tutorial 8 just needs more content overall, a lot of it is covered in the homework and there isn't much to adapt from previous years' material
Good thing for now is that we can just focus on editing the homework and finishing that so we know what new things to add to the tutorial instead
I think the student version should be pushed after the tester is perfect so it's just 1 change of removing the solutions and testing them on MarkUs (more efficient than last time)
I'm not a fan of the dance question in the old pset, BUT, I do like the data (since I contributed to it in that year lol) and the model making questions so I can turn that into what was done in the videos where predictions (y_hat) were plotted against actual y values. It makes a good transition into the R^2 tutorial discussion (and RMSE discussion too perhaps).
Noticed that in your version of the lecture, there were more images and things related to RMSE. I think that would be perfect for tutorial, but I need your opinion on that first
Would you be open to adding a very small intro to splitting data into train-test sets in my tutorial? Then we can use the same dance data and relate it to the homework. It becomes a thing where "you used plots in the homework but here is a much better way we will be learning about soon".
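
A minimal sketch of the y versus y_hat idea and how it leads into R^2 and RMSE. The data here is simulated, not the dance data, and the names (`x`, `y`, `y_hat`) and coefficients are assumptions for illustration only.

```python
import numpy as np

# Simulated stand-in data (hypothetical; not the dance data set)
rng = np.random.default_rng(130)
x = rng.uniform(0, 10, size=50)
y = 3.0 + 2.0 * x + rng.normal(0, 2.0, size=50)

# Fit a simple linear regression by least squares
# (np.polyfit returns the highest-degree coefficient first)
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

# R^2 computed two ways: squared correlation of y and y_hat,
# and 1 - SSE/SST; these agree for least squares with an intercept
r2_from_cor = np.corrcoef(y, y_hat)[0, 1] ** 2
sse = np.sum((y - y_hat) ** 2)
sst = np.sum((y - y.mean()) ** 2)
r2_from_sse = 1 - sse / sst

# RMSE is the typical size of a prediction error, in the units of y
rmse = np.sqrt(np.mean((y - y_hat) ** 2))

print(f"R^2 (cor^2): {r2_from_cor:.4f}, R^2 (1-SSE/SST): {r2_from_sse:.4f}, RMSE: {rmse:.3f}")
```

Plotting `y_hat` against `y` with a y = x reference line gives the visual version: the tighter the points sit around that line, the higher the R^2 and the lower the RMSE.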

Added STA130_F21_songrecommendations.csv to the data folder.

Lots of detailed comments from me orienting myself to things here and responding to your comments, as usual; but, just work through them one by one as you've done (well) in the past :)

Homework/STA130_HW_7_tester.ipynb file is updated with the message "31,459 additions, 369 deletions not shown because the diff is too large. Please use a local Git client to view these changes." Before I follow those instructions, can you, roughly speaking, describe your intention with the updates you've made to the last tester file? [You might just catch me up on the process we were doing if I've lost track of that -- I wasn't necessarily expecting any additional changes on the old tester file, but perhaps I forgot and should have been?]
I'd like to see your draft/proposal for a Tutorial 8 Assignment. I'd expect I'd like your suggestion here.
- I will soon see how the homework looks; but, generally, I'd like a tutorial "lesson" to perhaps start with R^2=cor(y,y-hat)^2 as you've suggested, then move on to RMSE (as my lectures have done) and adopt the code that got "skipped" in the week 7 tutorial that visually shows RMSE for simple linear regression and/but then extend it to multiple linear regression, and then move into the notions of overfitting and training/test analysis where we can show that R^2 can be made to be 1 (just as RMSE can be made to be 0) in a training data set, but that it is the test data set where we can observe generalization
- As mentioned, Cole's week 9 (and 10) are less concerned with questions of overfitting and generalization; so, this is content that I want us to pursue in week 8 to the best of our ability given the time/space limitations we'll have there.
- You might consider reviewing my other comments and our other exchanges again as well to make sure that my intentions above are lining up consistently with our progress and progression of development so far
- I am going to share the course project with you (soon)... it essentially amounts to model building variable selection exploration with a specific orientation towards examining and interpreting specific interactions... honestly, it might be worth trying to introduce everything above with the course project data exactly...
Do you have any draft of tutorial 8 at this point yet? It's not in the commits; but, no worries, as I think my comments above suggest that having a clean slate for tutorial 8 is perfect: we have plenty of material that we can put into the tutorial as far as I'm concerned.
Agreed -- let's see what the hw finalizes as; and, then, what the tutorial should be will likely become pretty clear
Agreed -- good plan
I like your orientation and idea here. I've tended to really like the proposals that you all have made. What happens when I get content ideas from everyone is, what I think is, this is good -- needs to be fleshed out and content thickened so the reading is comprehensive and standalone and/or the tutorial is fully scripted and content full so the TAs don't need to improvise or come up with anything themselves; although, I'm happy for the TAs to "decide" how to use the tutorial material themselves... i.e., what they emphasize and dwell on and what they say "this is here, have a look closer if you want".
Agreed -- yes, exactly, that's always been my feeling about RMSE since I decided there wasn't room for what got made for Tutorial 7. That was because I decided starting with correlation and spending time on that in tutorial was important; but, it then meant that there wasn't time to go into the RMSE in week 7 tutorial. Especially since I decided I wanted to focus on interpretation material (and the indicator variables) and such
Yes, it has also become my plan/expectation that we need to motivate and introduce train/test in week 8 tutorial, because week 9 material seems to mostly just assume it and doesn't dwell tremendously on motivating it. This was some of my discussion in our previous exchanges (probably both in previous PR conversations, as well as perhaps slack DMs).
- I do not expect to introduce train/test in lecture, as I'll more be motivating model building, interactions, etc.
- It's good we've already introduced the idea of interactions in HW 7, and indicators a little bit there as well; and, indicators a little bit more extensively in the week 7 tutorial; so, I should be able to lecture on these things comfortably; and, it's just a "this is how you code this up". I likely will as well discuss R^2 and make some related comments about generalizability; but,
- I do think tutorial is the right place to go a little deeper on this; so,
- we are indeed looking to do a lot in tutorial 8... [leave model building/interpretability stuff up to me in lecture, and] focus on overfitting/generalization ideas for the tutorial
- In some sense it feels like "shouldn't this belong in decision trees / classification / (machine learning)?" But, actually, honestly, I think it's more of a "deep" / "understanding" type of concept that is relevant for multiple linear regression; and, so, tutorial indeed is the location where I'd like this kind of content presentation to go
- Cole's week 9 tutorial starts to dive into the ethics of FP/FN decision making; and, the week 9 homework goes into other things, such as "feature importances", which, I think is why the emphasis on train/test generalization hasn't really been the main point of week 9 materials, and/why/but has ended up feeling like it should belong in week 8.
- Also, it's okay/probably best if train/test generalization is not a part of the week 8 homework, actually... I think using the homework to have students practice just doing multiple linear regression, and model building, and maybe interpreting predictions is best... so hopefully that's what I'll next be seeing as I next take a first look at the proposed homework!
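
For the overfitting/generalization thread above, here is a rough sketch of the "R^2 can be made to be 1 in training" demonstration. It uses simulated data and polynomial degree as a stand-in for model complexity; both are assumptions for illustration, not the tutorial's actual data or models.

```python
import numpy as np

rng = np.random.default_rng(8)

def simulate(x):
    """Hypothetical truth: a straight line plus noise."""
    return 1.0 + 2.0 * x + rng.normal(0, 0.5, size=x.shape)

# Small training set, larger test set from the same process
x_train = np.linspace(0, 1, 8)
y_train = simulate(x_train)
x_test = rng.uniform(0, 1, size=200)
y_test = simulate(x_test)

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def r_squared(y, y_hat):
    return 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Degree 7 with 8 training points interpolates the training data exactly
results = {}
for degree in (1, 3, 7):
    coefs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = {
        "train_r2": r_squared(y_train, np.polyval(coefs, x_train)),
        "train_rmse": rmse(y_train, np.polyval(coefs, x_train)),
        "test_rmse": rmse(y_test, np.polyval(coefs, x_test)),
    }

for degree, r in results.items():
    print(degree, {k: round(float(v), 4) for k, v in r.items()})
```

Training RMSE can only go down (and training R^2 up) as terms are added, since each model nests the previous one. The test RMSE is where the degree-7 fit would usually look worse, although the exact numbers depend on the seed.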

One more comment to add to those above: model building with p-values is the usual statistical approach; so, this needs to be presented/contrasted with the RMSE train/test idea as well.

hw8_tester

fleshed out preliminary MarkUs/testing indications at top of file to the more recent versions
I wonder if it's a good idea to add in the project data or switch some of the current data out and make the questions instead for the project data... ?
Q1 extended hint slightly; and, added as a hidden cell
- Addressed SettingWithCopyWarning which will manifest later on if not addressed here
Q1 -> Q0 and broke it apart
Q2 extended hint and automated question
- looks like the p-value was wrong: misreading/misusing the p-value of the intercept?
Q3 test needed. Perhaps try a test based on confirming the correct number of low/med/high
- I agree with you that substantial help should be provided in this hint; and, maybe even another way to do this not based on np.select?
Q4: I've worked on this one; but, it seems things in this notebook are more of a rough draft than I'm looking for; so, have a look at the changes I've made and then see if you can finalize the remainder of the notebook up to that level.
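
A sketch of two of the points above: taking an explicit `.copy()` so later column assignments don't trigger SettingWithCopyWarning, and `pd.cut` as an alternative to `np.select` for building low/medium/high tiers. The data frame, the `seller_rating` column, and the cutoffs (3 and 4) are hypothetical and not taken from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical listings; column names and cutoffs are illustrative only
listings = pd.DataFrame({
    "seller_rating": [2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 1.0],
    "total_pr": [40.0, 42.5, 47.0, 51.0, 55.5, 60.0, 38.0],
})

# Taking an explicit copy of a filtered frame avoids SettingWithCopyWarning
# when new columns are assigned to it later
subset = listings[listings["total_pr"] > 0].copy()

# Option 1: np.select with ordered conditions (first match wins)
conditions = [subset["seller_rating"] <= 3, subset["seller_rating"] <= 4]
subset["tier_select"] = np.select(conditions, ["low", "medium"], default="high")

# Option 2: pd.cut with right-closed bins (-inf, 3], (3, 4], (4, inf)
subset["tier_cut"] = pd.cut(
    subset["seller_rating"],
    bins=[-np.inf, 3, 4, np.inf],
    labels=["low", "medium", "high"],
)

# A count-based check of the kind suggested for the Q3 test
tier_counts = subset["tier_cut"].value_counts().to_dict()
print(tier_counts)
```

A test on the number of rows in each tier is insensitive to how the student built the column, which is probably what is wanted if both approaches are allowed.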

I will move over to work on Matthew's PR and the course project so that I can share with you what that's going to be. That I think will inform your thinking about if we can include and use some of that in the homework and/or tutorial; and, anyway, how we could orient our materials here to help the students prepare and be ready for what's asked of them for the final project.

I actually like your orientation in Q5 but I moved that up as a hint/tutorial/explanation for Q4; so, I've gone ahead and done a little more and adjusted Q5 to (what I think is) a better, more interesting question

I do quite like how this homework is shaping up... I imagine you'll next move into interactions...

I'd like to see the homework address some higher order terms (self interactions) if possible; and, I'd like to as well see the homework make some different contrasts based on just specifying the exact indicators of interest, as opposed to just using the formula='total_pr ~ C(seller_rating_tier)' style approach of Q4/Q5; although, you'll see in my hint to Q5 how baseline and contrast choices could be specified in a simple, hacky kind of way.
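
One way to picture "specifying the exact indicators of interest" together with a higher order term, sketched with plain least squares rather than the formula interface. The data is made up and noise-free so the coefficients are known in advance; the variable names and the choice of "high" as the baseline are assumptions for illustration.

```python
import numpy as np

# Hypothetical data: a numeric predictor and a three-level tier
x = np.arange(12, dtype=float)
tier = np.array(["low", "medium", "high"] * 4)

# Explicit indicators; "high" is the baseline because it gets no column
is_low = (tier == "low").astype(float)
is_medium = (tier == "medium").astype(float)

# Noise-free response built from known coefficients so the fit is checkable
true_beta = np.array([10.0, 2.0, 0.5, -4.0, 3.0])
X = np.column_stack([np.ones_like(x), x, x ** 2, is_low, is_medium])
y = X @ true_beta

# Least squares fit: intercept, x, x^2 (self interaction), and the two contrasts
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

names = ["intercept", "x", "x^2", "low vs high", "medium vs high"]
for name, b in zip(names, beta_hat):
    print(f"{name:>15}: {b: .3f}")
```

Changing which indicator is left out changes the baseline, and therefore which group differences the coefficients (and their p-values, in a full regression summary) speak to.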

Thought it would be a good mental exercise to get a draft of the tutorial assignment question before going to bed. Here it is (also am still working on the tutorial slides):

As a first-year student exploring the vast range of opportunities university has to offer, you decide to join the basketball team (a friendly reminder to get involved in extracurriculars and events!). The coaches get to know you better and find out that you are studying statistics. Since the team is currently training for a provincial competition, the coaches have been collecting significant amounts of data and want to analyze the key factors influencing the team's performance. The coaches have a breadth of numerical data on shots, rebounds, assists, player experience, and player sleep. Also, they have categorical data on pre-game routines, off-court practice, health history, and player nutrition. They believe the more complicated model will allow them to fix all sorts of small issues in their team to help them perform at their best.

You explain how you have learned about multiple-linear regression and techniques for creating a reliable model. The coaches only know a little about simple-linear regression and are interested in learning your process for creating and selecting an appropriate model. Your task is to show an overview of this process, including the practical implications of your potential findings and what the coaches can do to support their players. You should write down some hypothetical equations, explain any transformations needed in the data, and the differences between simple and multiple-linear regression. Do not be afraid to use technical statistical terms, but be sure to explain their meaning in simple and understandable ways that would help a non-statistical audience make sense of what you're talking about.

Q0 needs a # test_Q0
Minor edits Q1-Q7
I want the Q8 LaTeX formatting to match Q4, as this formal notation is better than the informal alternative imo
Q8 needs an auto failing test with the usual notice of automatic failure not counting against the student
- Make sure the printout renders correctly in MarkUs -- backslashes need to be used carefully
- Please also add a little guidance/explanation as to how you made these equations
- Actually... can you make Q8 an autotest by asking for the intercept and slope for each of the groups?
- The hint can be based on my comments above and what you already have
Added # test_Q9: makes sure these are present; otherwise, MarkUs doesn't pick them up and show them!
- Actually, can you make this an ABCD question with options that discuss the evidence against the null hypothesis that indicates that there are likely indeed differences between these slopes (with the "high" slope not 0 and "medium" being weak evidence of a difference from "high" slope, etc.) and I think there could be some good distractors based on what the model as specified does not give evidence for (such as "low"/"medium" being different from 0... there's no p-value attached to this question under the current specification)
Q10: I like this question, but I've changed it around a bit: please make this an ABCD autotest question
- again I'm going with formal indicator notation
- In your multiple choice options please put hats on the betas!
- For other ABCD options use the y-hat_new and y-hat_used you already have (without the y-hat_new and y-hat_used of course) and for the other options make up some weird interactions that aren't in the specification
- the following was not actually how you specified the model ~cond takes values "new" or 1 if the game is new and "used" or 0 if the game is used.~ so I've fixed that to match your provided solution
Q11 is a good follow up for Q10 and should just have the students answering explicitly the point you're trying to make with this question
- In your ABCD options for Q11 see if you can include some comments and distractors related to how this simplifying assumption may (or may not, as a distractor) actually be better since it defines a simpler (more parsimonious) model that could indeed be effective
Your Q12 follow up question is a great opportunity to demonstrate how to check if a more complex model is indicated/suggested/supported by the data... oh, I'm gonna make this happen in Q13->Q14
- Please make Q12 a multiple choice question (and use hats on betas again as is appropriate here!)
Q13: please make this the autotested multiple choice question as I've indicated with my edits
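
For the Q8 idea of autotesting the intercept and slope for each group, a sketch of how group-specific lines fall out of an interaction model. The data, the two-level `cond` variable, and the `student_answer` structure are hypothetical; the real question may have more groups and a different answer format.

```python
import numpy as np

# Hypothetical data: x is a numeric predictor, cond is "new" or "used"
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 1.5, 2.5, 3.5, 4.5, 5.5])
cond = np.array(["new"] * 5 + ["used"] * 5)
is_new = (cond == "new").astype(float)

# Noise-free lines per group so the expected answers are known exactly:
# used: y = 20 + 3x, new: y = 35 + 5x
y = np.where(cond == "new", 35.0 + 5.0 * x, 20.0 + 3.0 * x)

# Interaction model: y = b0 + b1*x + b2*1[new] + b3*x*1[new]
X = np.column_stack([np.ones_like(x), x, is_new, x * is_new])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# Group-specific intercepts and slopes implied by the fitted coefficients
group_lines = {
    "used": {"intercept": b0, "slope": b1},
    "new": {"intercept": b0 + b2, "slope": b1 + b3},
}

# An autotest could compare a student's entries against these with np.isclose
student_answer = {"used": (20.0, 3.0), "new": (35.0, 5.0)}  # hypothetical entries
all_correct = all(
    np.isclose(group_lines[g]["intercept"], a) and np.isclose(group_lines[g]["slope"], s)
    for g, (a, s) in student_answer.items()
)
print(group_lines, all_correct)
```

The baseline group reads its line straight off b0 and b1; the other group adds the indicator and interaction coefficients, which is the same bookkeeping the hats-on-betas equations in Q10 and Q12 spell out.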

I'm pausing here to comment that this whole sequence is outstanding. This is exactly the way I want these homework assignments to go... this really helps guide the students through the use and concepts of things here... just really fantastic

Q14: I've changed the orientation here as I hint/indicate I'm planning to do above
- please create the necessary autotest to support this question on the basis of "evidence against the null hypothesis" in the manner of Q2/Q5
- the interaction is default in plotly; but, this question is now updated to help the students practice model selection
- I'm hoping for some good model selection guidance based on p-values in the Spotify questions!
- don't forget, you'll need # test_Q14 or whatever you create won't show up in MarkUs
- This is not quite Simpson's paradox: that's when the slope ignoring subgroups is, say, positive, but the slope is then negative within each subgroup: https://stats.stackexchange.com/questions/478463/examples-of-simpsons-paradox-being-resolved-by-choosing-the-aggregate-data
- I've removed the comment about this; but, I'll let you think it over a bit
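
To make the Simpson's paradox definition concrete, a tiny made-up example where each subgroup has a negative slope but the pooled slope is positive. The numbers are invented purely for illustration.

```python
import numpy as np

# Two hypothetical subgroups, each with a slope of -1 within the group
x_a, y_a = np.array([1.0, 2.0, 3.0]), np.array([5.0, 4.0, 3.0])
x_b, y_b = np.array([6.0, 7.0, 8.0]), np.array([12.0, 11.0, 10.0])

slope_a = np.polyfit(x_a, y_a, deg=1)[0]
slope_b = np.polyfit(x_b, y_b, deg=1)[0]

# Pooling the groups and ignoring the grouping flips the sign of the slope
x_all = np.concatenate([x_a, x_b])
y_all = np.concatenate([y_a, y_b])
slope_all = np.polyfit(x_all, y_all, deg=1)[0]

print(f"group A: {slope_a:.2f}, group B: {slope_b:.2f}, pooled: {slope_all:.2f}")
```

If the within-group slopes merely differ from each other while keeping the same sign as the pooled slope, that is an interaction rather than Simpson's paradox.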

continuing

Q15: looks good minor edits
- # test_Q15 needed for MarkUs visibility/processing
Q16 needs # test_Q16 and please make this an actual test that checks that the update has been made based on ...=="minor").sum or something like that
- Nice touch adding the redone plot with no scale (which I assume works -- I can't see it at the moment as I don't yet have the data)
Q17 needs # test_Q17 and make this an autotest by checking the intercept and slope coefficients
Q18 -- I like this question, but I don't think it's really clear what you're asking the students to do
- make this more hand-holdy and walk them through step by step what you're trying to get them to do (instead of referring them to the lecture recording)
- I don't think just saying "Use plotly.graph_objects..." is enough to get students to know you're trying to ask them to make a y=x line...
- There is an r^2 printout but now no code for it? I think I would give the students the figure code, and then have them compute the R^2 or read it from the .summary()?
- I would also maybe have them compute residuals and then make a histogram of those?
- I like what you're wanting to have the students do here, but I'm trying to think about what question you can automate here...
- I think this question needs to be defined/refined/polished a little more.
- I think the question here should just autocheck if they got the residuals calculated right; but, the problem as a whole should point out the R^2, and the distribution of the residuals, and the original y v y-hat plot you initially envisioned.
- something like that...
Q19: formal indicator notation, and some other notational edits
- make an autotest based on entering/confirming model fit coefficients
- needs # test_Q19...
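
For the Q18 suggestion of autochecking the residuals, a rough sketch of what the reference computation and the check could look like. The data is simulated and `check_residuals` is a hypothetical helper, not an existing MarkUs or notebook function.

```python
import numpy as np

# Hypothetical data standing in for the homework variables
rng = np.random.default_rng(18)
x = rng.uniform(0, 1, size=100)
y = 40.0 + 25.0 * x + rng.normal(0, 5.0, size=100)

# Reference solution: fitted values and residuals from a least squares line
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x
residuals = y - y_hat

# R^2 from the residuals, and binned counts that a histogram would display
r_squared = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
counts, bin_edges = np.histogram(residuals, bins=10)

def check_residuals(student_residuals, reference=residuals):
    """Autotest-style check: same length and values match within tolerance."""
    student_residuals = np.asarray(student_residuals, dtype=float)
    return student_residuals.shape == reference.shape and bool(
        np.allclose(student_residuals, reference)
    )

# The second call has the sign flipped (y_hat - y), a likely student slip
print(check_residuals(y - y_hat), check_residuals(y_hat - y))
```

Checking only the residuals keeps the autotest simple, while the surrounding question text can still walk through the y versus y_hat plot, the R^2, and the residual histogram.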

I'm liking where this all seems to be going, but/and, I have a couple comments of what I'm hoping/expecting to see:

using p-value significance/evidence to evaluate models
- might need to add a question to do this: energy could be removed in Q19...
- Could just make this part of one single autotest for Q19?
dovetailing back to Q18 to see the y v y-hat, R^2, residuals, stuff still holds and can still be used in the same way as a simple linear regression model (which I think will be excellent to demonstrate and reinforce in this manner)
Q20: great -- so now this needs to be rebuilt to mirror Q18. Help walk the students through what you're trying to get them to notice and understand in this multivariate context (about y v y-hat, R^2, residuals, etc. whatever you're thinking is good for them to understand/consider).

Do we/Can we add some model assumption checks?

normality of residuals
residuals versus y-hat (fitted values) for homoskedasticity
we can skip considerations of independence and linear form (and fixed x's)
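
A sketch of what numeric versions of these two checks might look like, on simulated data that deliberately violates constant variance. The data-generating process and the "compare the two halves" summary are assumptions for illustration; a residual histogram and a residuals versus fitted values scatter plot would be the usual visual versions.

```python
import numpy as np

rng = np.random.default_rng(131)
n = 2000
x = rng.uniform(0, 10, size=n)

# Hypothetical heteroskedastic data: the noise spread grows with x
y = 5.0 + 2.0 * x + rng.normal(0, 0.2 + x, size=n)

slope, intercept = np.polyfit(x, y, deg=1)
fitted = intercept + slope * x
residuals = y - fitted

# Crude homoskedasticity check: residual spread for low vs high fitted values
low_half = fitted <= np.median(fitted)
spread_low = residuals[low_half].std()
spread_high = residuals[~low_half].std()

# Crude normality check: sample skewness and excess kurtosis
# (both would be near 0 for normally distributed residuals)
z = (residuals - residuals.mean()) / residuals.std()
skewness = np.mean(z ** 3)
excess_kurtosis = np.mean(z ** 4) - 3

print(f"spread low: {spread_low:.2f}, spread high: {spread_high:.2f}")
print(f"skewness: {skewness:.2f}, excess kurtosis: {excess_kurtosis:.2f}")
```

A clearly larger spread in one half is the numeric counterpart of the funnel shape in a residuals versus fitted values plot.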

Continuing...

Q21/Q22: while I generally (quite very much) like what you're doing here, what's interesting in this (model3) context is that mode is not significant (without being used as an interaction); so, these two questions should be re-oriented around/towards emphasizing that...
- so one of the questions should ask for "parallel" lines in the way you have
- and the followup question should show that it's no good (and perhaps/probably refer back to the image that visually suggests this(?))
Q23/Q24: again we're missing significance for these coefficients (which is interesting... relative to the fuller model)
- Can these questions be used to emphasize this? That the significance only comes from the more complex model?
- Is it possible that we can observe assumption violations as a result of using this (and perhaps the previous) under/mis-specified linear forms? That would be pretty interesting/awesome as a problem for the students...
Q25: I like the idea here; but, I think it may need to be reformulated a little bit to account for the fact that some of the models are not very compelling in terms of statistical significance...
- But this is a great question and a great use of the R^2 content that is being built up through the template you've been structuring your questions under

Can we add a small little segment that discusses that an observation is a row which can obviously be multivariate and have many measurements?

... this is something that could be introduced straightaway with data frames (but I don't think I thought to do this); but, I don't think it's necessarily that relevant at that point in time; whereas, it becomes relevant in linear regression; and/but, I think it's okay if we wait until multiple linear regression as opposed to simple linear regression to introduce this idea...

pointOfive / STA130_F23

Homework and Tutorial 8 #19

continuing