-
Before using your loss function: 2.5 sec/step.
After using your loss function: 32.0 sec/step.
I am using TensorFlow 1.6.0.
-
There is a numerical instability in the **objectives.categorical_crossentropy** function, which causes the gradient to vanish right after the first training batch.
I suggest adding $\epsilon$ to prevent the …
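A minimal sketch of the kind of fix being suggested, assuming the instability comes from taking `log(0)` (the function name, signature, and epsilon value here are illustrative, not the library's actual API):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy with predictions clipped away from 0 and 1,
    so log() never sees exactly 0 and the gradient stays finite."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Renormalize after clipping so each row still sums to 1.
    y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)
    return -np.sum(y_true * np.log(y_pred), axis=-1)
```

Without the clip, a confident wrong prediction of exactly 0 produces `log(0) = -inf` and NaN gradients on the very next step.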
-
## Background
Most d2l chapters follow the data/network/loss/training/prediction structure. Code is reused by saving it into the d2l package. However, the APIs are not consistent, and some functions hav…
-
Please go to Stack Overflow for help and support:
http://stackoverflow.com/questions/tagged/tensorflow
Also, please understand that many of the models included in this repository are experimenta…
-
**Describe the bug**
Training BERT using Keras NLP is significantly slower because `keras.layers.Embedding` is not XLA-compatible by default on TensorFlow GPU. This is similar to an issue repor…
-
I couldn't find the implementations of modReLU, CReLU or zReLU in this code. Has anyone coded/found them? Thanks.
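For reference, these activations are simple enough to write by hand. Here is a NumPy sketch following their usual definitions (the bias `b` in modReLU is normally a learnable per-unit parameter; it is passed in explicitly here):

```python
import numpy as np

def crelu(x, axis=-1):
    """CReLU: concatenate ReLU(x) and ReLU(-x), doubling the feature dim."""
    return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)], axis=axis)

def modrelu(z, b, eps=1e-8):
    """modReLU for complex z: rescale the magnitude by ReLU(|z| + b)
    while keeping the phase; zero wherever |z| + b <= 0."""
    mag = np.abs(z)
    scale = np.maximum(mag + b, 0) / (mag + eps)
    return scale * z

def zrelu(z):
    """zReLU: pass z through only where its phase lies in [0, pi/2],
    i.e. both real and imaginary parts are non-negative."""
    mask = (z.real >= 0) & (z.imag >= 0)
    return np.where(mask, z, 0)
```

Wrapping these in trainable layers (e.g. to make `b` learnable) is straightforward in any framework.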
-
I want to make a simple many-to-many RNN model (one output for each input), like the one below:
```
def create_model(inp):
    with C.layers.default_options(initial_state=0.1):
        m = Embedding(300)(…
```
-
Hey there!
I came across your project from Jeremy Howard's Twitter. I think it's great to be benchmarking these numbers and keeping them in a single place!
I've tried running your script and ran…
-
`models.py` imports `Embedding` on line 4, but it never seems to be used. Instead, the code that computes the word embeddings is this:
```python
wemb = TimeDistributed(Dense(output_dim=self.embed_size,
…
```
-
Preconditions (for shape checking, valid value ranges, etc.) would be valuable: they give better source locations in error messages and avoid uninformative TensorFlow runtime errors.
Example: `matmul(_:…
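To illustrate the idea in a hypothetical Python/NumPy sketch (not the API proposed above): the precondition fires at the call site with both shapes in the message, instead of an opaque error deep inside the kernel.

```python
import numpy as np

def checked_matmul(a, b):
    """Matrix multiply with explicit shape preconditions, so a mismatch
    is reported here with both operand shapes in the message."""
    if a.ndim != 2 or b.ndim != 2:
        raise ValueError(f"matmul expects 2-D arrays, got {a.ndim}-D and {b.ndim}-D")
    if a.shape[1] != b.shape[0]:
        raise ValueError(
            f"matmul shape mismatch: {a.shape} x {b.shape} "
            f"(inner dimensions {a.shape[1]} and {b.shape[0]} differ)")
    return a @ b
```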