-
Thanks for the implementation of NFNets in Keras!
I want to run it with the SGD_AGC optimizer, but I get the following error:
```
TypeError: Can't instantiate abstract class SGD_AGC with abstra…
```
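For context, this is Python's standard complaint when a class still has unimplemented abstract methods, which usually means this version of SGD_AGC doesn't override everything its optimizer base class declares abstract. A minimal, self-contained sketch of the mechanism (hypothetical class names, not the repo's actual code):
```
from abc import ABC, abstractmethod

class BaseOptimizer(ABC):
    # Stand-in for an optimizer base class with a required method.
    @abstractmethod
    def apply_gradients(self, grads_and_vars):
        ...

class BrokenSGD(BaseOptimizer):
    # apply_gradients is not overridden, so the class stays abstract and
    # BrokenSGD() raises:
    # TypeError: Can't instantiate abstract class BrokenSGD with ...
    pass

class WorkingSGD(BaseOptimizer):
    # Overriding every abstract method makes the class instantiable.
    def apply_gradients(self, grads_and_vars):
        for grad, var in grads_and_vars:
            var -= 0.01 * grad

WorkingSGD()     # fine
# BrokenSGD()    # raises the TypeError shown above
```
So the fix is usually to implement, or upgrade to a version that implements, the abstract methods named in the error message.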
-
Source code:
```
import os
import math
import argparse
import nsml
import torch
import torch.nn as nn
import torchvision.models as models
from data_loader import feed_infer
from data_local_loader impo…
```
-
Hi --
I'm wondering about the implementation of SGD w/ momentum in `autograd.optimizers.sgd`:
```
velocity = momentum * velocity - (1.0 - momentum) * g
x = x + learning_rate * velocity
```
O…
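For reference, the quoted rule damps the gradient by (1.0 - momentum), whereas the classical "heavy ball" formulation accumulates the raw gradient, so the two differ in effective step size by roughly a factor of 1 / (1 - momentum). A small self-contained comparison on a 1-D quadratic (illustrative only, not autograd's actual code):
```
def sgd_damped(grad, x, lr=0.1, momentum=0.9, steps=100):
    # The rule quoted above: gradient term scaled by (1 - momentum).
    v = 0.0
    for _ in range(steps):
        v = momentum * v - (1.0 - momentum) * grad(x)
        x = x + lr * v
    return x

def sgd_classical(grad, x, lr=0.1, momentum=0.9, steps=100):
    # Classical momentum: raw gradient, no (1 - momentum) damping.
    v = 0.0
    for _ in range(steps):
        v = momentum * v - grad(x)
        x = x + lr * v
    return x

grad = lambda x: 2.0 * x        # gradient of f(x) = x**2
print(sgd_damped(grad, 5.0))    # converges; the damped rule moves in smaller increments
print(sgd_classical(grad, 5.0))
```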
-
- GD needs the full training set for every parameter update, which is computationally expensive
- SGD tends to oscillate around the minimum, so the learning-rate schedule needs careful tuning
- How large the batch size should be is a question in itself (see the sketch after this list)
- On modern GPU architectures a large batch size helps data parallelism, and a large batch size generally has to be paired with a large learning rate
- With a large batch size, training accuracy is generally not as good as with a small batch size…
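To make the GD / SGD / mini-batch trade-off concrete, here is a minimal NumPy sketch for linear regression (all names illustrative): batch_size=len(X) recovers full-batch GD, batch_size=1 is pure SGD, and anything in between is mini-batch.
```
import numpy as np

def minibatch_sgd(X, y, lr=0.05, batch_size=32, epochs=20, seed=0):
    # batch_size=len(X) gives full-batch GD; batch_size=1 gives pure SGD.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(X))          # reshuffle every epoch
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            # Mean-squared-error gradient on the current mini-batch only.
            grad = 2.0 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
            w -= lr * grad
    return w

X = np.random.default_rng(1).normal(size=(256, 3))
w_true = np.array([1.0, -2.0, 0.5])
print(minibatch_sgd(X, X @ w_true))              # approaches w_true
```
This also hints at why larger batches pair with larger learning rates: the averaged gradient has lower variance, so bigger steps stay stable.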
-
Kathy preferred the SGD menus.
We will:
i) look at the PomBase and SGD menus side-by-side and compare contents to assess the differences and see where improvements can be made
ii) @manulera suggested …
-
We want to be able to do completely random mini-batch stochastic gradient descent (or maybe other flavours...)
We could consider something like:
https://github.com/epapoutsellis/StochasticCIL/bl…
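Assuming the goal is a pluggable rule for choosing which subset's gradient to use at each iteration, here is a sketch of two such samplers (hypothetical names, not StochasticCIL's actual API): "completely random" draws with replacement, while a shuffled variant visits every subset once per epoch.
```
import numpy as np

class RandomSampler:
    # "Completely random": draw a subset index uniformly, with replacement.
    def __init__(self, n_subsets, seed=None):
        self.n_subsets = n_subsets
        self.rng = np.random.default_rng(seed)

    def next(self):
        return int(self.rng.integers(self.n_subsets))

class ShuffledSampler:
    # Another flavour: visit every subset once per epoch, reshuffling between epochs.
    def __init__(self, n_subsets, seed=None):
        self.n_subsets = n_subsets
        self.rng = np.random.default_rng(seed)
        self.order = []

    def next(self):
        if not self.order:
            self.order = list(self.rng.permutation(self.n_subsets))
        return int(self.order.pop())

sampler = RandomSampler(n_subsets=10, seed=42)
print([sampler.next() for _ in range(5)])   # indices to feed a partial-gradient step
```
Keeping the sampler as a separate object queried once per iteration keeps the sampling flavour orthogonal to the update rule itself.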
-
Asynchronous SGD also has many options; one option that looks promising now is a ring approach.
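For reference, the best-known ring pattern is ring all-reduce: each worker's gradient is split into chunks, partial sums travel around the ring (reduce-scatter), and the finished chunks then circulate again (all-gather). Assuming that is the approach meant here, a toy single-process simulation (illustrative only, not any particular framework's implementation):
```
import numpy as np

def ring_allreduce(grads):
    world = len(grads)
    chunks = [list(np.array_split(np.asarray(g, dtype=float), world)) for g in grads]
    # Reduce-scatter: after world-1 steps, rank r holds the full sum of one chunk.
    for step in range(world - 1):
        for rank in range(world):
            src = (rank - 1) % world              # receive from the left neighbour
            c = (rank - step - 1) % world         # chunk index rotates round the ring
            chunks[rank][c] = chunks[rank][c] + chunks[src][c]
    # All-gather: circulate the completed chunks until every rank has them all.
    for step in range(world - 1):
        for rank in range(world):
            src = (rank - 1) % world
            c = (rank - step) % world
            chunks[rank][c] = chunks[src][c]
    return [np.concatenate(ch) for ch in chunks]

grads = [np.ones(6) * (r + 1) for r in range(3)]  # workers hold gradients 1, 2, 3
print(ring_allreduce(grads)[0])                   # every rank ends with the sum: 6.0s
```
Each worker only ever talks to its neighbours, so the data sent per worker stays roughly constant as workers are added, which is what makes the ring attractive.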
-
Are all seeds handed down correctly? Results can apparently vary, even with passing tests, when using get_benchmark(seed).
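One way to check, and to make the hand-down explicit, is to derive every component's generator from a single SeedSequence and assert that two calls with the same seed agree bit-for-bit; get_benchmark below is a hypothetical stand-in for the real function:
```
import numpy as np

def get_benchmark(seed):
    # Hypothetical stand-in: derive independent child seeds from one root so
    # every randomized component is reproducible from the single argument.
    root = np.random.SeedSequence(seed)
    data_ss, model_ss = root.spawn(2)
    data_rng = np.random.default_rng(data_ss)
    model_rng = np.random.default_rng(model_ss)
    return data_rng.normal(size=3), model_rng.normal(size=3)

# Identical seeds must give identical results; if they don't, some component
# is drawing from an unseeded global RNG instead of the handed-down seed.
a, b = get_benchmark(0), get_benchmark(0)
assert all(np.array_equal(x, y) for x, y in zip(a, b))
```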