-
It is well known that many convolutions can be expressed as a direct matrix multiplication (Im2Col and more subtle ideas). The cuDNN white paper states directly that NVIDIA developers use precisely …
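
For readers unfamiliar with Im2Col, here is a minimal NumPy sketch of the idea, not the cuDNN implementation. It assumes a single-channel 2-D input, stride 1, and no padding; the function names `im2col`, `conv2d_im2col`, and `conv2d_direct` are made up for illustration.

```python
import numpy as np


def im2col(x, kh, kw):
    """Unfold every kh x kw patch of the 2-D array x into one row."""
    h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((out_h * out_w, kh * kw), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = x[i:i + kh, j:j + kw].ravel()
    return cols


def conv2d_im2col(x, k):
    """'Valid' cross-correlation computed as a single matrix-vector product."""
    kh, kw = k.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    return (im2col(x, kh, kw) @ k.ravel()).reshape(out_h, out_w)


def conv2d_direct(x, k):
    """Reference implementation using explicit sliding windows."""
    kh, kw = k.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


x = np.arange(16.0).reshape(4, 4)
k = np.array([[1.0, 0.0], [0.0, -1.0]])
result = conv2d_im2col(x, k)
```

With several output channels, the kernel vector becomes a matrix with one column per filter, so the whole layer reduces to one matrix-matrix product.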
-
## 🚀 Feature: SplitLogSoftmaxWithLoss
An approximation of Log Softmax based on [splitsoftmax](https://gist.github.com/alisafaya/785e431539917cfbaab23281b77699d9).
## Motivation
The cur…
-
Dear friends,
Could you please provide an example showing how to move CPU data to the GPU and run inference with the C API?
It would be appreciated if you could give more examples for some of the APIs, such as Creat…
-
Many (all?!!) of the RNN-related raw cuDNN calls in
https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/cudnn/Descriptors.h
https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/…
-
Thanks for your code!
I think using tf.stop_gradient() for both "mean_loc" and "sample_loc" causes the gradients of the location network to be None. Here is the gradient information:
GlimpseNetwork/…
-
## 🚀 Feature
The LSTM forget-gate bias should be initialized to 1 or 2 for better training.
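
A minimal sketch of what such an initialization could look like, written in plain NumPy rather than as a proposed API. It assumes the packed bias layout that `nn.LSTM` documents, with gates ordered input, forget, cell, output; the helper name `init_lstm_bias` is hypothetical.

```python
import numpy as np


def init_lstm_bias(hidden_size, forget_bias=1.0):
    """Return a packed bias of length 4 * hidden_size.

    Assumed gate order: (input, forget, cell, output), so the forget
    gate occupies the second block of hidden_size entries.
    """
    bias = np.zeros(4 * hidden_size, dtype=np.float32)
    bias[hidden_size:2 * hidden_size] = forget_bias
    return bias


bias = init_lstm_bias(hidden_size=3)
```

Note that `nn.LSTM` keeps two bias vectors per layer (`bias_ih_l*` and `bias_hh_l*`) that are added together, so filling the forget slice of both with 1 gives an effective forget bias of 2.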
## Motivation
Please see:
https://pdfs.semanticscholar.org/1154/0131eae85b2e11d53df7f1360eeb6476e7f4.pdf
http:…
-
Amazing work! I had a practical question about the time it took to train these models on the setup you described in the article. Would you be able to share more? In addition, would this repository be …
-
Guys, this is a very general comment and FYI...
To some extent you guys seem to be viewing NestedTensor as a generic ragged-tensor data structure, similar to TensorFlow's RaggedTensor. I understan…
-
Hi,
I am looking at the PPO implementation, and I am curious about this part (many other implementations use this workflow as well, so I am also curious whether I am missing anything):
…
-
Sorry to bother you again.
I used tensorflow=1.2.1 and python=3.6 and ran worker.py following your instructions, but it raised an error.
**ValueError:** Trying to share variable tcm/word/fw/multi_rnn_ce…