-
7/8 Optimization Methodology
- Evolution of optimization methods
- Gradient Descent Algorithm
- Finding the minimum point of a given function
- The function's space is its parameters; once the number of parameters becomes enormous, the shape of the function can no longer be determined
- Assume that only the gradients of the parameters are known (to minimize the cost function, the cos…
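In code, the idea above (following only the local gradient, never inspecting the full function shape) is plain gradient descent; the quadratic cost below is purely illustrative:

```python
# Minimal gradient descent on an illustrative quadratic cost f(w) = (w - 3)^2.
# Each step uses only the local gradient, never the global shape of f.

def grad(w):
    # Analytic gradient of (w - 3)^2
    return 2.0 * (w - 3.0)

w = 0.0      # initial parameter
lr = 0.1     # learning rate
for _ in range(100):
    w -= lr * grad(w)

print(round(w, 4))  # converges toward the minimizer w = 3
```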
-
# BGD, SGD, MBGD
## Reference
- [Stochastic gradient descent, batch gradient descent, and mini-batch gradient descent, with code implementations](https://blog.csdn.net/LoseInVain/article/details/78243051)
- [Introduction to Optimizers](https://blog.csdn.net/weixin_41417982/article/de…
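The difference between the three variants comes down to how many samples feed each gradient step: the full dataset (BGD), one sample (SGD), or a small batch (MBGD). A minimal NumPy sketch on a noiseless least-squares problem (all data and hyperparameters here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                      # noiseless targets for illustration

def grad(w, Xb, yb):
    # Gradient of the mean squared error over the batch (Xb, yb)
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

def run(batch_size, steps=500, lr=0.1):
    w = np.zeros(3)
    n = len(y)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        w -= lr * grad(w, X[idx], y[idx])
    return w

w_bgd = run(batch_size=len(y))   # BGD: the full dataset every step
w_sgd = run(batch_size=1)        # SGD: a single sample every step
w_mbgd = run(batch_size=32)      # MBGD: a small batch every step
```

On this noiseless problem all three converge to `true_w`; the trade-off in practice is gradient noise versus cost per step.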
-
The [`DenseVariational`](https://www.tensorflow.org/probability/api_docs/python/tfp/layers/DenseVariational) layer provides a `kl_weight` parameter, whose documentation reads:
> **`kl_weight`**: Amount by …
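The quote above is truncated, but a common convention is to set `kl_weight = 1/num_train_examples`, so that the dataset-level KL term is amortized per example in the ELBO. A sketch with made-up numbers (all names and values below are illustrative, not the TFP API):

```python
# Illustrative ELBO-style loss assembly; the numbers are made up.
# Convention: kl_weight = 1 / num_train_examples, so the KL divergence
# (defined once for all weights) is scaled to a per-example contribution.

num_train_examples = 1000
kl_weight = 1.0 / num_train_examples

nll_per_example = 0.7   # hypothetical average negative log-likelihood
kl_divergence = 250.0   # hypothetical KL(posterior || prior) over all weights

loss = nll_per_example + kl_weight * kl_divergence
print(loss)  # 0.7 + 0.25 = 0.95
```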
-
Explanation of Cost and Loss function:
https://medium.com/@vinodhb95/what-is-loss-in-neural-nets-is-cost-function-and-loss-function-are-same-ef069a570e95
Explanation of Linear regression regarding…
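The cost-vs-loss distinction is often summarized as: the loss is the error for a single example, while the cost averages that loss over the whole dataset. A minimal sketch using squared error, as in linear regression:

```python
# "Loss" for one example vs. "cost" as the average loss over the dataset.

def loss(y_true, y_pred):
    # Per-example squared error
    return (y_true - y_pred) ** 2

def cost(ys_true, ys_pred):
    # Mean of the per-example losses over the whole dataset
    return sum(loss(t, p) for t, p in zip(ys_true, ys_pred)) / len(ys_true)

print(loss(3.0, 2.5))                 # 0.25
print(cost([3.0, 1.0], [2.5, 2.0]))   # (0.25 + 1.0) / 2 = 0.625
```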
-
We want to be able to do completely random mini-batch stochastic gradient descent (or maybe other flavours...).
We could consider something like:
https://github.com/epapoutsellis/StochasticCIL/bl…
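Since the linked file is truncated, here is a neutral sketch (not the StochasticCIL API): one simple way to get completely random, epoch-covering mini-batches is to permute the indices each epoch and slice:

```python
import numpy as np

def minibatch_indices(n, batch_size, rng):
    """Yield random, non-overlapping mini-batches that cover one epoch."""
    perm = rng.permutation(n)
    for start in range(0, n, batch_size):
        yield perm[start:start + batch_size]

rng = np.random.default_rng(0)
batches = list(minibatch_indices(10, 3, rng))
# Every index appears exactly once per epoch; the last batch may be smaller.
```

Other flavours (sampling with replacement, fixed-size partial epochs) only change how the index sets are drawn.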
-
Hi @zhangyuc:
I have seen your paper about Splash, and I have a question:
Is the experiment "Local solutions with unit-weight data" the same as what Spark MLlib currently implements? BTW, according to my exp…
-
In the PyTorch implementation of K-FAC, G1_ is computed as:
G1_ = 1/m * a1.grad.t() @ a1.grad
However, a1.grad differs from the a_1 in (1) of the K-FAC paper. Specifically, when you do back…
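For reference, the quantity that line computes is the empirical average of per-sample outer products g_i g_iᵀ. A NumPy sketch with made-up gradients (note the likely source of the discrepancy: if `a1.grad` comes from a mean-reduced loss, each row already carries an extra 1/m relative to the paper's per-sample gradients):

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 8, 4
# Hypothetical per-sample gradients w.r.t. a layer's pre-activations,
# one row per example (shape m x d), standing in for a1.grad.
g = rng.normal(size=(m, d))

# Empirical Kronecker factor: the average of per-sample outer products,
# computed in one matrix product (this mirrors 1/m * a1.grad.T @ a1.grad).
G = g.T @ g / m

# The same quantity via an explicit loop over examples, for comparison.
G_loop = sum(np.outer(gi, gi) for gi in g) / m
assert np.allclose(G, G_loop)
```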
-
As discussed with @siddharthteotia, consider adding some common statistical analysis methods to the SQL language.
A few examples:
1. Pearson's coefficient
2. Sampling (bernoulli/stratified)
5. Histogram…
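For the first item, the underlying formula (which some SQL dialects already expose as a `CORR` aggregate) is straightforward; a plain-Python sketch:

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient r = cov(x, y) / (std(x) * std(y))."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 for perfectly linear data
```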
-
Hi,
Thanks for your idea and work. I would like to figure out how you dealt with unmatched sample sizes, in other words, when the two batch sizes are not the same.
I notice that in the MMD loss you used…
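For what it's worth, the standard biased squared-MMD estimator does not require the two batches to have equal size: the cross term averages over all n·m pairs rather than pairing samples one-to-one. A NumPy sketch with an RBF kernel (the kernel choice and `gamma` are assumptions, not necessarily what this repo uses):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    """Biased squared-MMD estimate; X (n x d) and Y (m x d) may have n != m."""
    Kxx = rbf_kernel(X, X, gamma).mean()
    Kyy = rbf_kernel(Y, Y, gamma).mean()
    Kxy = rbf_kernel(X, Y, gamma).mean()   # averages over all n*m pairs
    return Kxx + Kyy - 2.0 * Kxy

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))
Y = rng.normal(size=(48, 5))   # deliberately different batch size
print(mmd2(X, Y))              # small for same-distribution samples
```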
-
Hi,
I used "imagenet-resnet-152-dag" for fine-tuning on my dataset. However, I got the error below. I am using two 6 GB GPUs.
_Caused by:
Error using dagnn.Sum/forward (line 15)
Out o…