Questions for Midterm01

skywang0407 commented 4 years ago

Hi Professor McGowan, here are some questions I have for review (especially the last one):

Can you please explain "well separated" and why logistic regression is unstable in this situation?
Do we need to remember this graph?
Can you please explain which pair of points show the best model/flexibility?
Can you explain why as flexibility increases, bias decreases and variance increases?
Do we need to remember these equations? And do the two equations work for the same data, or one for the training data and one for the test data?
What does the dashed line mean here? And do we draw the dashed line at the bottom of each curve?
Does the Bayes Classifier mean the Bayes decision boundary, and what is the relationship between these two concepts and the truth?
Will we be tested on the topics below that are not covered in class?
Can you please explain what the marginal default rate is? And why is the dashed StudentYes line above the dashed StudentNo line while the StudentYes curve is below the StudentNo curve?
Will we be tested on the multinomial logistic regression? And can you explain how we calculate the denominator here, since I do not understand how L influences the beta values.
What does sensitivity mean here? Just curious.
I know we use this equation in class, but I just want to make sure that this is the process we use to determine the decision boundary (find X) when K=2, right? What if K>2: do we equate the three classes in the same way?
How do we explain this plot?
Do we need to know these concepts (i.e. hyperplane etc.)?
Why is it sensible? Is it because we use a different data set to test?
What does c mean in the equation? If it means class, how can it be multiplied by x? And do we need to know concepts of two learning methods?
Can you please explain again why it is biased upward and why the variance gets higher? I am still not clear about the concept of bias-variance trade-off.

Thank you very much!!!

LucyMcGowan commented 4 years ago

Can you please explain "well separated" and why logistic regression is unstable in this situation?

"well separated" means that the predictors separate the classes VERY well, which sounds like a good thing (and generally is!), but the maximum likelihood estimation procedure has trouble in these cases. When the classes are perfectly separated, the likelihood keeps increasing as the coefficients grow, so the parameter estimates can become infinite.
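
To make the separation issue concrete, here is a minimal sketch. It uses made-up data and a no-intercept logistic model, and the function names are just for illustration. With perfectly separated classes the log-likelihood keeps rising as the slope grows, so there is no finite maximum; with overlapping classes it peaks and then falls.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(beta, xs, ys):
    """Log-likelihood of a no-intercept logistic model P(y=1|x) = sigmoid(beta * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # log P(y=1|x) = log_sigmoid(z); log P(y=0|x) = log_sigmoid(-z)
        total += log_sigmoid(z) if y == 1 else log_sigmoid(-z)
    return total

# Hypothetical toy data, chosen only to illustrate the point.
xs = [-2.0, -1.0, 1.0, 2.0]
separated = [0, 0, 1, 1]     # every negative x is class 0, every positive x is class 1
overlapping = [0, 1, 0, 1]   # classes are mixed, so no cut point separates them

for beta in (0.5, 1.0, 5.0, 20.0):
    print(beta, log_likelihood(beta, xs, separated), log_likelihood(beta, xs, overlapping))
```

In the separated column the values keep climbing toward 0 as beta grows, which is why a fitting routine would keep pushing the estimate outward.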

Do we need to remember this graph?

no

Can you please explain which pair of points show the best model/flexibility?

the blue points are ideal

Can you explain why as flexibility increases, bias decreases and variance increases?

Here is a blog post that tries to explain this
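
As a rough complement, here is a small simulation sketch of the trade-off. It assumes a made-up true function, noise level, and a k-nearest-neighbors average as the model, none of which come from the course materials. A very flexible fit (k = 1) tracks the truth closely on average but changes a lot from one training set to the next; a rigid fit (k = 25) is stable but systematically off.

```python
import math
import random

def true_f(x):
    # Hypothetical "truth" used only for this simulation.
    return math.sin(2 * math.pi * x)

def knn_predict(xs, ys, x0, k):
    """Average the responses of the k training points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k

def bias_variance(k, x0=0.25, n=50, sigma=0.5, n_sims=2000, seed=1):
    """Estimate squared bias and variance of the k-NN prediction at x0."""
    rng = random.Random(seed)
    xs = [i / (n - 1) for i in range(n)]  # fixed grid of predictor values
    preds = []
    for _ in range(n_sims):
        # Draw a fresh noisy training set each time.
        ys = [true_f(x) + rng.gauss(0, sigma) for x in xs]
        preds.append(knn_predict(xs, ys, x0, k))
    mean_pred = sum(preds) / n_sims
    bias_sq = (mean_pred - true_f(x0)) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / n_sims
    return bias_sq, variance

results = {k: bias_variance(k) for k in (1, 25)}
for k, (bias_sq, variance) in results.items():
    print(f"k={k}: squared bias={bias_sq:.4f}, variance={variance:.4f}")
```

Smaller k means more flexibility here, so the k = 1 row should show the lower squared bias and the higher variance.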

Do we need to remember these equations? And do the two equations work for the same data, or one for the training data and one for the test data?

which equations?

What does the dashed line mean here? And do we draw the dashed line at the bottom of each curve?

the vertical dashed line shows the optimal flexibility based on the test error.

Does the Bayes Classifier mean the Bayes decision boundary, and what is the relationship between these two concepts and the truth?

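the Bayes classifier assigns each observation to its most likely class, and it will give the minimum error rate (assuming the true probabilities are known). The Bayes decision boundary is the set of points where that classifier switches from one class to another.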
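
A tiny sketch of the minimum-error claim, using invented probabilities for a predictor with three possible values (all numbers are hypothetical). It compares the Bayes rule, which predicts the more probable class at each x, against every other possible rule.

```python
from itertools import product

# Hypothetical, fully known probabilities for a predictor X with three values.
p_x = [0.5, 0.3, 0.2]           # P(X = x)
p_y1_given_x = [0.1, 0.6, 0.9]  # P(Y = 1 | X = x)

def error_rate(rule):
    """Expected misclassification rate of a rule mapping each x to class 0 or 1."""
    err = 0.0
    for px, p1, pred in zip(p_x, p_y1_given_x, rule):
        # Predicting 1 is wrong with probability 1 - p1; predicting 0 with probability p1.
        err += px * ((1 - p1) if pred == 1 else p1)
    return err

# Bayes rule: pick the class with the larger conditional probability at each x.
bayes_rule = tuple(1 if p1 > 0.5 else 0 for p1 in p_y1_given_x)
bayes_error = error_rate(bayes_rule)

# Brute force over all 2^3 possible rules to confirm none does better.
best_any_rule = min(error_rate(rule) for rule in product([0, 1], repeat=len(p_x)))
print(bayes_rule, bayes_error, best_any_rule)
```

In practice the true probabilities are unknown, so the Bayes classifier serves as a benchmark rather than something we can fit.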

Will we be tested on the topics below that are not covered in class?

no

Can you please explain what the marginal default rate is? And why is the dashed StudentYes line above the dashed StudentNo line while the StudentYes curve is below the StudentNo curve?

marginal is the effect without any other variables in the model; the conditional effect is the effect from the model that includes additional variables. The marginal (dashed) effect is different from the conditional curve due to Simpson’s paradox.
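
Here is a small illustration of how that reversal can happen, with invented counts (these are not the real data from the figure). Within each balance level students default less often, but because more students sit in the high-balance group, their overall (marginal) rate comes out higher.

```python
# Hypothetical counts: (group, balance level) -> (people, defaults)
counts = {
    ("student", "low"): (200, 2),
    ("student", "high"): (800, 80),
    ("non-student", "low"): (800, 16),
    ("non-student", "high"): (200, 30),
}

def conditional_rate(group, balance):
    """Default rate for one group at a fixed balance level."""
    people, defaults = counts[(group, balance)]
    return defaults / people

def marginal_rate(group):
    """Default rate for one group, ignoring balance entirely."""
    people = sum(n for (g, _), (n, _d) in counts.items() if g == group)
    defaults = sum(d for (g, _), (_n, d) in counts.items() if g == group)
    return defaults / people

for balance in ("low", "high"):
    print(balance, conditional_rate("student", balance), conditional_rate("non-student", balance))
print("marginal", marginal_rate("student"), marginal_rate("non-student"))
```

The conditional comparison favors students at both balance levels, while the marginal comparison goes the other way.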

Will we be tested on the multinomial logistic regression? And can you explain how we calculate the denominator here, since I do not understand how L influences the beta values.

you will not be tested on this

What does sensitivity mean here? Just curious.

sensitivity is another word for true positive rate
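
As a quick sketch of the calculation (the labels below are made up, and the function name is just for illustration): sensitivity is the number of true positives divided by the number of actual positives.

```python
def sensitivity(y_true, y_pred):
    """True positive rate: share of actual positives (1s) that were predicted as 1."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return true_pos / (true_pos + false_neg)

# Hypothetical labels: 4 actual positives, of which 3 are caught.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
print(sensitivity(y_true, y_pred))
```

The false positive in the example does not affect sensitivity, since only the actual positives enter the formula.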

I know we use this equation in class, but I just want to make sure that this is the process we use to determine the decision boundary (find X) when K=2, right? What if K>2: do we equate the three classes in the same way?

yes, except here the priors are equal, which may not always be the case. You won’t have to solve by hand when K > 2
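
A minimal sketch of the idea, assuming the usual one-predictor LDA discriminant with a shared variance (the equation from the slide is not shown in this thread, so treat the formula and all numbers here as assumptions). For K = 2 the boundary is where the two discriminants are equal; for K > 2 each x simply goes to the class with the largest discriminant.

```python
import math

def discriminant(x, mu, sigma2, prior):
    """One-predictor LDA discriminant: x*mu/sigma^2 - mu^2/(2*sigma^2) + log(prior)."""
    return x * mu / sigma2 - mu ** 2 / (2 * sigma2) + math.log(prior)

def classify(x, mus, sigma2, priors):
    """Assign x to the class (by index) with the largest discriminant."""
    scores = [discriminant(x, mu, sigma2, p) for mu, p in zip(mus, priors)]
    return scores.index(max(scores))

def boundary_two_classes(mu0, mu1, sigma2, prior0, prior1):
    """Value of x where the two discriminants are equal (requires mu0 != mu1)."""
    return (mu0 + mu1) / 2 + sigma2 * math.log(prior0 / prior1) / (mu1 - mu0)

# Hypothetical class means and variance.
equal = boundary_two_classes(-1.0, 2.0, 1.0, 0.5, 0.5)    # midpoint of the means
unequal = boundary_two_classes(-1.0, 2.0, 1.0, 0.8, 0.2)  # shifts toward the rarer class
print(equal, unequal)

# K = 3 with equal priors: no hand-solving, just take the largest discriminant.
print([classify(x, [-1.0, 2.0, 5.0], 1.0, [1 / 3, 1 / 3, 1 / 3]) for x in (0.0, 3.0, 6.0)])
```

With equal priors the boundary is just the midpoint of the two means; unequal priors move it toward the mean of the less common class.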

How do we explain this plot?

this plot helps visualize the data in a lower-dimensional space (the dimensions that optimally discriminate between groups)

Do we need to know these concepts (i.e. hyperplane etc.)?

you need to generally understand that LDA is a data reduction method
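
To give a feel for the data reduction idea, here is a rough sketch with invented two-dimensional points and two classes. It computes a single discriminant direction from the class means and the pooled covariance, then replaces each 2-D point with one score along that direction. The helper names are hypothetical.

```python
def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def pooled_covariance(groups):
    """Pooled within-class covariance (sxx, sxy, syy) of 2-D points."""
    sxx = sxy = syy = 0.0
    n_total = 0
    for points in groups:
        mx, my = mean(points)
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
        n_total += len(points)
    dof = n_total - len(groups)  # total observations minus number of classes
    return sxx / dof, sxy / dof, syy / dof

def discriminant_direction(class0, class1):
    """w = inverse(pooled covariance) * (mean1 - mean0), for two classes in 2-D."""
    sxx, sxy, syy = pooled_covariance([class0, class1])
    det = sxx * syy - sxy ** 2
    (m0x, m0y), (m1x, m1y) = mean(class0), mean(class1)
    dx, dy = m1x - m0x, m1y - m0y
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

def project(points, w):
    """Reduce each 2-D point to a single score along direction w."""
    return [w[0] * x + w[1] * y for x, y in points]

# Hypothetical data for two classes.
class0 = [(1, 2), (2, 1), (2, 3), (3, 2)]
class1 = [(5, 6), (6, 5), (6, 7), (7, 6)]
w = discriminant_direction(class0, class1)
scores0, scores1 = project(class0, w), project(class1, w)
print(w, scores0, scores1)
```

Two predictors become one score per observation, and in this toy example the two classes do not overlap on that single axis.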

Why is it sensible? Is it because we use a different data set to test?

yes

What does c mean in the equation? If it means class, how can it be multiplied by x? And do we need to know concepts of two learning methods?

here the c's are constants, not classes

Can you please explain again why it is biased upward and why the variance gets higher? I am still not clear about the concept of bias-variance trade-off.

check out the above blog post for bias-variance trade-off

@skywang0407 I’ve added responses 👍

skywang0407 commented 4 years ago

Thank you very much!!! I am sorry I ask too much.

sta-363-s20 / community

Questions for Midterm01 #43