-
**Summary**
I'm hitting a NaN loss when I use TransformerLayer in place of a PyTorch transformer layer I wrote.
**Details**
I'm using the nvcr.io/nvidia/pytorch:24.04-py3 docker cont…
-
### Applies To
- [X] Notebooks (.ipynb files)
- [ ] Interactive Window and/or Cell Scripts (.py files with #%% markers)
### What happened?
I'm facing the same issue reported in https://github.…
-
@HongtaoYang, I am very grateful for your source code! However, I have found that your implementation is very sensitive to the network's parameters, such as:
- In the batch_normalization lay…
-
The [Deep Speech 2](http://arxiv.org/abs/1512.02595) paper used _sequence-wise_ normalization for recurrent computation, which was shown to substantially improve final generalization error while gre…
-
### What?
- [CS231n: Deep Learning for Computer Vision](http://cs231n.stanford.edu/) assignments self-study
-
Hi! Thanks for sharing your code. After reading your paper, I found the idea very interesting; it is a little like SN (switchable normalization).
I have a little question about your paper and im…
-
```
from tensorflow.keras import backend as K
from tensorflow.keras.models import model_from_json
import tensorflow as tf
import keras2onnx
sess = tf.Session()
K.set_session(sess)
K.set_learn…
```
-
### System Info
GPU: H20 server
CUDA Version: 12.5
Driver: 555.42.02
TRTLLM Commit: 2d234357c6e69fa514f6e9b4d4a5ad3bc431c4a6
built from source on linux
### Who can help?
_No response_
…
-
I plan to use a custom-trained model in a local environment without network access.
What's the best way to run inference with a saved model: via
`model = torch.hub.load(...)` or
`model = attempt_load('…
-
Hello, I read your paper and got a little confused about Figure 3. What does the y-axis (1, 2, 3, 4, 5) of Figure 3 refer to? After the domain transform, the norm of the C-channels feature is n…