-
Following the part of the STDP learning tutorial that mixes STDP with gradient descent, I built a VGG to classify CIFAR10-DVS.
I checked GPU memory at every step and found that within each iteration, memory spikes during the model's forward pass but is never released, so it accumulates across all subsequent iterations. Right now it runs out of memory after only a few iterations, before finishing even one epoch.
Is this behavior expected, or could something be misconfigured somewhere?
The VGG-7 is modeled on one from a paper; the learning part is identical to the tutorial.
```…
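A hedged guess at the cause (the truncated code above doesn't show it either way): memory grows across iterations whenever a tensor that is still attached to the autograd graph is kept alive between batches. With SpikingJelly-style stateful neurons this commonly happens when the network is not reset after each batch (e.g. via `functional.reset_net(net)`), or when a running loss is accumulated as a tensor instead of a Python float. The minimal sketch below (a toy `Linear` model, not the VGG-7 in question) illustrates the second pattern:

```python
import torch

# Toy model standing in for the VGG-7; the point is only to contrast the
# two accumulation styles below.
model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

running_tensor = 0.0   # leaks: the sum stays attached to the autograd graph
running_float = 0.0    # safe: a plain Python float, no graph reference

for _ in range(3):
    x = torch.randn(4, 10)
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

    running_tensor = running_tensor + loss  # keeps every loss tensor alive
    running_float += loss.item()            # detaches: graph can be freed

print(type(running_tensor), running_tensor.requires_grad)
print(type(running_float))
```

If memory grows only during the forward pass even on the first iteration, the other usual suspect is a monitor/recorder object storing spikes or membrane potentials without detaching them.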
-
According to this blog post: http://www.fast.ai/2018/07/02/adam-weight-decay/ and the article it mentions, https://arxiv.org/abs/1711.05101, Adam has problems when used with L2 regularization. If I understand…
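The core issue from the cited paper can be sketched numerically: with classic L2 regularization the decay term is folded into the gradient *before* Adam's adaptive scaling, so weights with large gradient history are decayed less; AdamW applies the decay directly to the weights, decoupled from that scaling. A minimal numpy sketch of one update step under both schemes (parameter values are illustrative):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8,
              wd=0.0, decoupled=False):
    """One Adam update on a scalar weight.

    decoupled=False: classic L2 -- decay is added to the gradient and then
    passes through the adaptive scaling.
    decoupled=True:  AdamW (Loshchilov & Hutter, 1711.05101) -- decay is
    applied to the weight independently of the gradient statistics.
    """
    if not decoupled:
        g = g + wd * w                     # L2 folded into the gradient
    m = b1 * m + (1 - b1) * g              # first-moment estimate
    v = b2 * v + (1 - b2) * g * g          # second-moment estimate
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:
        w = w - lr * wd * w                # AdamW: decoupled decay
    return w, m, v

w0, g = 1.0, 0.5
w_l2, _, _ = adam_step(w0, g, 0.0, 0.0, t=1, wd=0.1, decoupled=False)
w_aw, _, _ = adam_step(w0, g, 0.0, 0.0, t=1, wd=0.1, decoupled=True)
print(w_l2, w_aw)  # the two schemes produce different weights
```

In PyTorch the decoupled variant is available as `torch.optim.AdamW`, as opposed to passing `weight_decay` to `torch.optim.Adam`.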
-
Dear all,
I am a big fan of this library! Unfortunately, I am missing support for adversarial attacks on regression models.
It seems that you already support regression:
https://adversarial-robu…
-
# [Training](https://pcc.cs.byu.edu/2017/10/02/practical-advice-for-building-deep-neural-networks/)
- **Use the ADAM optimizer.** It works really well. Prefer it to more traditional optimizers such as …
-
### Expected behavior
We are simulating QAOA on an NVIDIA DGX system.
Since the new pennylane version (v0.22) supports cuQuantum using the "lightning.gpu" device, we want to use it for potential sp…
-
Hey!
I was reading through the code and I noticed that you're applying the exponential element-wise to the matrix here:
https://github.com/google/neural-tangents/blob/5f286b7696364217aa4a2d92378aabd0203a791e/n…
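For context on why the distinction matters: the element-wise exponential `np.exp(A)` and the matrix exponential `expm(A) = Σₖ Aᵏ/k!` agree only for diagonal matrices and differ in general. A small self-contained numpy sketch (the example matrix is hypothetical, not taken from the linked code):

```python
import numpy as np

def expm_series(A, terms=30):
    """Matrix exponential via a truncated Taylor series.

    Fine for small-norm matrices; scipy.linalg.expm is the robust choice.
    """
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # A^k / k!
        result = result + term
    return result

# Nilpotent matrix: expm(A) = I + A exactly, since A @ A = 0.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

elementwise = np.exp(A)      # exp applied entry by entry
matrix_exp = expm_series(A)  # true matrix exponential

print(elementwise)  # [[1, e], [1, 1]]
print(matrix_exp)   # [[1, 1], [0, 1]]
```

So whether the element-wise form in the linked line is intentional depends on the kernel derivation it implements.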
-
Hi, can you provide some info about the optimizer (ADAM, RMSProp, etc.) used in YOLOv3? Also, where can I find the optimizer and the loss function in the source code? Thank you.
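For what it's worth (this is an assumption from the reference Darknet configuration, not confirmed by this repo): the standard `yolov3.cfg` specifies plain SGD with momentum and weight decay (`momentum=0.9`, `decay=0.0005`) rather than Adam or RMSProp. The update rule it implies can be sketched in numpy (values are illustrative):

```python
import numpy as np

def sgd_momentum_step(w, g, vel, lr=1e-3, momentum=0.9, weight_decay=5e-4):
    """One SGD step with momentum and L2 weight decay, Darknet-style."""
    g = g + weight_decay * w       # decay folded into the gradient
    vel = momentum * vel - lr * g  # velocity accumulates past gradients
    return w + vel, vel

w, vel = 1.0, 0.0
for _ in range(2):
    w, vel = sgd_momentum_step(w, 0.2, vel)  # constant toy gradient
print(w, vel)
```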
-
Replace the gradient descent chapter with a more general optimization one.
To be included:
- [x] Mathematical definition of the optimization problem
- Different terminology (minimization versus…
-
Is there any more effective method than starting from a learning rate of 0.1? If so, why?
Besides the learning rate, how should weight_decay, momentum, etc. be set?
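One way to see why no single starting value like 0.1 is universally right (an illustrative sketch, not from the original thread): on the quadratic f(w) = ½·a·w², plain gradient descent converges only when lr < 2/a, so the safe range depends on the curvature of the loss, which is why techniques like learning-rate range tests and warmup exist.

```python
def gd(lr, a=25.0, steps=50, w0=1.0):
    """Gradient descent on f(w) = 0.5 * a * w**2; gradient is a*w.

    Each step multiplies w by (1 - lr*a): |1 - lr*a| < 1 means convergence.
    """
    w = w0
    for _ in range(steps):
        w -= lr * a * w
    return abs(w)

# With a=25, the stability threshold is 2/25 = 0.08:
print(gd(0.1))   # lr above threshold -> |w| blows up
print(gd(0.01))  # lr below threshold -> |w| shrinks toward 0
```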
-
## 📚 Documentation
The README https://github.com/pytorch/examples/blob/main/imagenet/README.md is very helpful when getting started with training AlexNet.
We are able to successfully train AlexN…