-
I see your code is running through the whole dataset for each training iteration. For many applications, it is quicker to split it up into smaller random batches and run gradient descent on each "mini…
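Something along these lines is what I mean; this is only an illustrative NumPy sketch (the data, learning rate, and batch size are made up for the example), not your code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.01 * rng.normal(size=1000)

w = np.zeros(5)
lr, batch_size, n_epochs = 0.1, 32, 20

for epoch in range(n_epochs):
    order = rng.permutation(len(X))            # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)   # MSE gradient on this mini-batch only
        w -= lr * grad
```

Each update only touches one small batch, so you get many cheap parameter updates per pass over the data instead of one expensive one.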
-
Most of the examples use very small batch sizes, which leads to a lot of tiny GPU kernels and poor GPU utilization. Increasing the batch size can help a lot. For example, on the Tox21 example I find that increa…
-
I am performing batch gradient descent, where the gradients are averaged across all training examples to move forward in a single training epoch. This can easily be parallelized by splitting the origi…
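As a rough sketch of the split-and-average idea (plain NumPy, with a thread pool standing in for whatever workers would actually be used; all names and sizes here are illustrative):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))
y = X @ rng.normal(size=8)

def chunk_gradient(w, Xc, yc):
    # Mean-squared-error gradient computed on one chunk of the data.
    return 2.0 * Xc.T @ (Xc @ w - yc) / len(yc)

n_chunks = 4
X_chunks = np.array_split(X, n_chunks)
y_chunks = np.array_split(y, n_chunks)
chunk_weights = [len(c) / len(y) for c in y_chunks]

w = np.zeros(8)
lr = 0.1
with ThreadPoolExecutor(max_workers=n_chunks) as pool:
    for epoch in range(50):
        # Each chunk's gradient is independent of the others, so this map
        # could just as well be dispatched to separate processes or machines.
        grads = list(pool.map(lambda args: chunk_gradient(w, *args),
                              zip(X_chunks, y_chunks)))
        # Size-weighted average of the chunk gradients equals the exact full-batch gradient.
        full_grad = sum(cw * g for cw, g in zip(chunk_weights, grads))
        w -= lr * full_grad
```

Because the per-chunk gradients are weighted by chunk size before averaging, each step here reproduces the single full-batch gradient step exactly.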
-
The examples in the [slim README.md](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim) give basic documentation for training and evaluating models when used separately; how…
-
My training loss fluctuates a lot, but it decreases overall.
I tried lowering the learning rate, but I get the same result.
The display step for training is not that small either.
What do you think about it…
-
Hello Davis,
I'm testing the new dnn face detector on my images and I noticed that for some batch sizes it reports:
`Error while calling cudaMalloc(&backward_data_workspace, backward_data_workspace_si…
-
Separate a global parameter server from the master and update gradients using a configurable mini-batch setting.
This issue is about the solution, not the implementation.
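Only to illustrate the intended split (not a proposed implementation, and not tied to any existing code; every class and name below is made up for the example), a single-process toy version could look like this:

```python
import numpy as np

class ParameterServer:
    # Owns the single global copy of the weights, separate from the master/workers.
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def pull(self):
        return self.w.copy()

    def push(self, grads):
        # Average the workers' mini-batch gradients, then take one update step.
        self.w -= self.lr * np.mean(grads, axis=0)

def minibatch_gradient(w, Xb, yb):
    # MSE gradient on one worker's mini-batch.
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 6))
y = X @ rng.normal(size=6)
shards = list(zip(np.array_split(X, 4), np.array_split(y, 4)))   # one data shard per worker

ps = ParameterServer(dim=6)
for step in range(200):
    w = ps.pull()
    grads = []
    for Xs, ys in shards:
        idx = rng.choice(len(ys), size=64, replace=False)        # each worker draws its own mini-batch
        grads.append(minibatch_gradient(w, Xs[idx], ys[idx]))
    ps.push(grads)
```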
-
So far, in each epoch Shifu does batch gradient descent. To improve convergence, we should support mini-batch training.
Set a parameter like miniBatchRate.
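For illustration only, here is a rough sketch of what a miniBatchRate parameter could mean, in plain Python rather than Shifu itself (everything except the parameter name is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))
y = X @ rng.normal(size=10)

w = np.zeros(10)
lr = 0.1
mini_batch_rate = 0.05                                  # e.g. 5% of the training records per iteration
sample_size = max(1, int(mini_batch_rate * len(y)))

for step in range(500):
    idx = rng.choice(len(y), size=sample_size, replace=False)   # random subset instead of the full set
    Xb, yb = X[idx], y[idx]
    grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)          # gradient on the sampled records only
    w -= lr * grad
```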
-
Hi,
Thanks for your great work.
I am trying to understand how to use it.
It would have been useful to have a sample of some images and their corresponding CSV file with labels, so I can reproduce the f…
-
Hi pescadores,
first of all: thanks for this nice package!
I have used it now for some DNNs and it seems to work. However, two questions remain:
* Can you point me to some literature where th…