-
@kangk9908
https://github.com/kcarnold/cs344-exam-23sp/blob/6a5024bc438f6db811ce74682ee7ba1fc4684112/u02-sa-learning-rate/SLO.md?plain=1#L1
From how I read the SLO from unit 2, I think it more rel…
-
At the moment we use the "strawberry" gradient descent method on the error surface given by the respective error function.
The question now is: which method is best to use?
A candidate is **Stochastic Gradient …
-
I may be missing something, but it looks as if you're doing full-batch gradient descent (i.e. using the entire dataset each step) in the SGD class. SGD should just use a single example selected at random.
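For contrast, a minimal sketch (illustrative, not the repo's code) of the difference this comment points at: full-batch gradient descent averages the gradient over every example, while SGD draws a single random example per update.

```python
import numpy as np

# Least-squares toy problem: loss(w) = 0.5 * mean((X @ w - y)**2).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w  # noiseless, so both methods can recover true_w

w = np.zeros(3)
lr = 0.01
for step in range(2000):
    # Full-batch GD would use every row each step:
    #   grad = X.T @ (X @ w - y) / len(X)
    # SGD instead uses ONE randomly chosen example:
    i = rng.integers(len(X))
    grad = X[i] * (X[i] @ w - y[i])
    w -= lr * grad
# w is now close to true_w
```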
-
The basic idea is to represent the joint state-action value function as a Gaussian process. The optimal policy can be approximated with a few steps of gradient descent on the action subspace, holding …
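A hedged sketch of that policy step: hold the state fixed and take a few gradient steps on the action alone. Here `q` is a toy differentiable stand-in for the GP state-action value function (the paper's actual model is a Gaussian process), chosen so the optimum is known in advance (action = -state).

```python
import numpy as np

def dq_daction(state, action):
    # Gradient of the toy value q(s, a) = -||a + s||^2 with respect to a.
    return -2.0 * (action + state)

state = np.array([0.3, -0.7])   # held fixed
action = np.zeros(2)
lr = 0.2
for _ in range(25):
    action += lr * dq_daction(state, action)  # gradient ascent in the action subspace
# action is now close to the maximiser -state
```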
-
-
Great implementation of the paper!
The paper used mini-batch gradient descent with a batch_size of 10, but I can't seem to find that in the training step. It seems that you are training on one observation at a time…
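For reference, a sketch of the mini-batch loop the comment describes (batch_size of 10); the names here are illustrative, not the repo's:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true

w = np.zeros(3)
lr, batch_size = 0.05, 10
for epoch in range(100):
    perm = rng.permutation(len(X))          # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Average the gradient over the mini-batch, not the whole dataset:
        grad = Xb.T @ (Xb @ w - yb) / batch_size
        w -= lr * grad
```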
-
Basically these are parameters that aren't updated via gradient descent but are still serialized; a good example that already exists here is the running mean or running variance in batch normalisation…
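A small illustration of that distinction in PyTorch terms: parameters receive gradients and are updated by the optimizer, while buffers (e.g. BatchNorm's `running_mean` / `running_var`) appear in the `state_dict` but are never touched by gradient descent.

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(3))             # trained by GD
        self.register_buffer("running_mean", torch.zeros(3))  # serialized only

m = Demo()
print([n for n, _ in m.named_parameters()])  # ['weight']
print(list(m.state_dict().keys()))           # ['weight', 'running_mean']
```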
-
On a contour plot, probably. Sampling far away from the minimum should show most minibatch gradients pointing in roughly the right direction, while closer to the minimum the agreement is not as good. Can do this with varying…
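A sketch of that check on a noisy least-squares surface (illustrative setup, not from any repo): compare each mini-batch gradient's direction to the full-batch gradient, once at a point far from the minimum and once close to it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
noise = rng.normal(size=1000)
w_star = np.array([1.0, 1.0])
y = X @ w_star + noise  # noise makes minibatch gradients disagree near the minimum

def mean_cosine(w, batch_size=10):
    """Average cosine similarity between minibatch and full-batch gradients at w."""
    full = X.T @ (X @ w - y) / len(X)
    cosines = []
    for start in range(0, len(X), batch_size):
        Xb, yb = X[start:start + batch_size], y[start:start + batch_size]
        g = Xb.T @ (Xb @ w - yb) / batch_size
        cosines.append(g @ full / (np.linalg.norm(g) * np.linalg.norm(full)))
    return float(np.mean(cosines))

far = mean_cosine(w_star + np.array([10.0, 10.0]))
near = mean_cosine(w_star + np.array([0.05, 0.05]))
# Far from the minimum the minibatch gradients mostly agree with the full
# gradient (cosine near 1); close to the minimum the noise dominates.
```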
-
Hi all,
I am currently trying to understand how to do semi-supervised learning with gpytorch based on this paper: https://arxiv.org/pdf/1805.10407.pdf
I would like to set up a mini-batch approach…
-
I have been using your code (thank you very much for the nice repo), and I wonder if line 85 of som.py should be removed. The line in question simply has:
`delta.div_(batch_size)`
But I suspect …
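For context (hypothetical tensors, not som.py's internals): `div_` is an in-place divide, so `delta.div_(batch_size)` turns a batch-summed update into a per-sample mean. Whether the line belongs depends on whether `delta` was accumulated as a sum over the batch.

```python
import torch

batch_size = 4
per_sample = torch.arange(batch_size, dtype=torch.float32).reshape(-1, 1)
delta = per_sample.sum(dim=0)   # summed over the batch -> tensor([6.])
delta.div_(batch_size)          # in-place divide -> the mean, tensor([1.5])
```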