kcarnold / cs344-exam-23sp


u02-sa-learning-rate SLO and question scope #34

Open MatthewWalstra opened 1 year ago

MatthewWalstra commented 1 year ago

@kangk9908 https://github.com/kcarnold/cs344-exam-23sp/blob/6a5024bc438f6db811ce74682ee7ba1fc4684112/u02-sa-learning-rate/SLO.md?plain=1#L1

From how I read the Unit 2 SLO, I think it relates more to the input data and the labels for that data than to the parameters used to train the model. I think this SLO from Unit 4 is more applicable, though not completely. @kcarnold I might have missed it, but I didn't see a SLO for learning rates. Because HW3 and u5n03 (plus a few other homeworks and labs) refer to gaining an intuition for learning rates, I think it should be its own SLO. However, that's a topic for a different issue.

Unit 4: Describe the overall approach of Stochastic Gradient Descent: how does it use information from a batch of data to improve its performance on that and other data?

I think asking two questions in one is too broad; the question would be clearer if it focused on one thing, such as a specific case of a high, low, or ideal learning rate. If you use the Unit 4 SLO, make sure the question directly relates to learning rates and Stochastic Gradient Descent. For example:

  1. What does the learning rate do in Stochastic Gradient Descent? It sets the step size taken in the direction of the negative gradient, so that the loss decreases.
  2. When using Stochastic Gradient Descent, why is the average loss with a high learning rate greater than the loss with an ideal learning rate? Too high a learning rate causes the parameters to change too much on each step, so they overfit to the current batch of samples rather than learning to generalize. This batch overfitting leads to greater loss on each subsequent batch and therefore a higher average loss. If the learning rate is too high, the loss will oscillate without converging.
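
To make the step-size idea in the two example answers concrete, here is a minimal sketch. It rests on assumptions that are not from the course materials: it uses plain (full-batch) gradient descent on a toy one-parameter loss `(w - target)**2` rather than SGD on real batches, and the function name `gradient_descent`, the target value, and the three learning rates are all hypothetical. It only illustrates how the learning rate scales each step and how too large a value makes the loss grow instead of shrink. It does not model the batch-overfitting effect.

```python
def gradient_descent(lr, steps=20, w0=0.0, target=3.0):
    """Minimize the toy loss (w - target)**2; return the loss after each step.

    Hypothetical helper for illustration only (not from the course code).
    """
    w = w0
    losses = []
    for _ in range(steps):
        grad = 2.0 * (w - target)  # derivative of the loss with respect to w
        w = w - lr * grad          # step against the gradient, scaled by lr
        losses.append((w - target) ** 2)
    return losses


# Assumed learning rates, chosen only to show three regimes on this toy loss.
for name, lr in [("low", 0.01), ("moderate", 0.4), ("high", 1.1)]:
    losses = gradient_descent(lr)
    print(f"{name:>8} lr={lr}: final loss = {losses[-1]:.4g}")
```

On this toy loss, each update multiplies the distance to the minimum by `1 - 2 * lr`. A low rate shrinks it slowly, a moderate rate shrinks it quickly, and a rate above 1 flips the sign and grows the distance on every step, so the parameter overshoots back and forth and the loss increases.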

MatthewWalstra commented 1 year ago

Clarifying Review

  1. Check if another SLO is more applicable to learning rate (see example above). Otherwise, leave it because learning rate is an important topic.
  2. Consider limiting the scope of the question. For example, focus on the importance of learning rate as it relates to training time or to minimizing loss.

kangk9908 commented 1 year ago

Thanks for the extensive review. It made me realize that I should focus more closely on what the SLOs are asking for. This was just a question that I had personally and thought would make a good exam question.

kcarnold commented 1 year ago

There should be a SLO on learning rates; if there isn't one, that's a bug with the course materials.