-
I am working on a project where I sample a set of n-dimensional points from a Gaussian distribution (of learnt parameters) as follows and then evaluate those points based on a loss function to update …
-
Memory allocation and release will probably become a bottleneck during forward and backward propagation.
During the forward pass it will hold the input tensors in cache. During backward pass it wi…
-
The attacker API here could be simpler, IMHO, if the adversarial training were implemented in `tf.keras.Model.train_step`. That function is called by `fit()` on every batch of data.
By using …
-
# Deep learning is EXPENSIVE
e.g.
training ResNet50 on the ImageNet dataset for 80 epochs:
80 epochs * 1.3M images * 7.7B ops per image
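
To make the scale concrete, the product above can be worked out directly. This is only a rough sketch using the figures quoted here; the sustained throughput of 10 trillion ops per second is a hypothetical number chosen for illustration, not something stated in the original.

```python
# Back-of-the-envelope estimate using the figures quoted above.
EPOCHS = 80
IMAGES_PER_EPOCH = 1_300_000      # ~1.3M images, as quoted
OPS_PER_IMAGE = 7_700_000_000     # ~7.7B ops per image, as quoted

total_ops = EPOCHS * IMAGES_PER_EPOCH * OPS_PER_IMAGE


def training_days(total_ops, ops_per_second):
    """Wall-clock days needed at a given sustained throughput."""
    seconds = total_ops / ops_per_second
    return seconds / 86_400  # seconds per day


print(f"total ops: {total_ops:.3e}")  # total ops: 8.008e+17

# Hypothetical device sustaining 1e13 ops/s (an assumption for illustration).
print(f"days on one device: {training_days(total_ops, 1e13):.2f}")
```

Real training time is usually longer than this estimate suggests, since hardware rarely sustains its peak throughput and the count above ignores data loading and communication.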
# Solution?
- **Data Parallelism (large batch training)**
![image](http…
-
Did I get something wrong somewhere? Or does it really need this much GPU memory?
-
@MikeInnes I have a very simple model that does not train on Flux#master due to NaNs from exploding gradients. However, the exact same code works and trains as expected with `Zygote.pullback -> Tracker…
-
**Project Description:** Min-max fairness is a natural and desirable notion of subgroup fairness. The goal of this project is to develop open source implementations of recent [research](https://arxiv.…
-
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
import tenso…
-
In the PyTorch implementation of kfac, G1_ is computed as:
G1_ = 1/m * a1.grad.t() @ a1.grad
However, a1.grad is different from the a_1 in (1) of the kfac paper. Specifically, when you do back…
-
![image](https://user-images.githubusercontent.com/23263731/200564311-220f4f34-3d31-4eb5-a5d1-6af2683cab5a.png)
![image](https://user-images.githubusercontent.com/23263731/200296378-68401e7c-…