-
Adding `tf.keras.layers.Dropout` to the model results in the **following error**:
```
tensorflow.python.framework.errors_impl.AbortedError: Compute: Operation received an exception: Compute: No MLCTraining…
```
-
When trying out the keras autoquant notebook, the error shown in the title appears. It seems to be an issue related to the quantization op library.
Full error message:
```
AttributeError: in user code:
…
```
-
**Describe the bug**
I implement multiple transformer layers that share a single layer's parameters (e.g., one layer is applied recursively six times to construct a 6-layer transformer). When I use activation checkp…
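A minimal sketch of the setup described above, assuming PyTorch's `torch.utils.checkpoint` is the checkpointing mechanism in use (the model sizes and depth are illustrative, not the author's exact configuration):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class RecursiveTransformer(nn.Module):
    """One TransformerEncoderLayer whose parameters are reused six times,
    with activation checkpointing applied to each (shared) application."""

    def __init__(self, d_model=64, nhead=4, depth=6):
        super().__init__()
        # A single layer holds the only parameters; it is applied `depth` times.
        self.shared_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth):
            # Non-reentrant checkpointing is the currently recommended mode.
            x = checkpoint(self.shared_layer, x, use_reentrant=False)
        return x

model = RecursiveTransformer()
out = model(torch.randn(2, 10, 64))
out.sum().backward()  # gradients accumulate across all six reuses of the layer
```

Because the layer is shared, its parameter gradients are the sum of contributions from every application, which is worth keeping in mind when comparing against a true 6-layer model.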
-
I was attempting to view the feature maps of a pretrained VGG model in PyTorch.
Instead of saving the features in the `forward` method of the model, I registered a forward hook on the layer(s) wher…
-
This is an idea for a module to calculate the activation time in a few different ways, including min dV/dt, max gradient, and Matthijs' method. Some thought needs to be put into the methods that are t…
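Two of the definitions mentioned above can be sketched with numpy; the function name, signal, and sampling grid below are illustrative assumptions, not the module's actual interface:

```python
import numpy as np

def activation_time(t, v, method="max_gradient"):
    """Return the time at which the signal `v` 'activates' under the chosen rule."""
    dvdt = np.gradient(v, t)  # numerical dV/dt
    if method == "max_gradient":
        idx = np.argmax(dvdt)  # time of steepest upstroke
    elif method == "min_dvdt":
        idx = np.argmin(dvdt)  # time of steepest downstroke
    else:
        raise ValueError(f"unknown method: {method}")
    return t[idx]

t = np.linspace(0, 10, 1001)
v = np.tanh(5 * (t - 4))  # synthetic upstroke centred at t = 4
print(activation_time(t, v))  # ≈ 4.0
```

Each method reduces to picking an index from the numerical derivative, so new definitions (e.g. threshold crossings) slot in as extra branches.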
-
Hi @ismailuddin
I looked at your implementation of Grad-CAM, and it seems to me that the heatmaps are calculated using gradients of post-softmax outputs rather than logits (pre-softmax). The last lay…
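One common way to obtain pre-softmax gradients in Keras is to swap the final layer's activation for a linear one before differentiating; a hedged sketch (the toy model below is illustrative, not the repository's architecture):

```python
import tensorflow as tf

# Toy classifier with a softmax output, standing in for the real model.
inp = tf.keras.Input(shape=(8,))
x = tf.keras.layers.Dense(16, activation="relu")(inp)
out = tf.keras.layers.Dense(3, activation="softmax")(x)
model = tf.keras.Model(inp, out)

# Replace the softmax with a linear activation so the model emits logits.
model.layers[-1].activation = tf.keras.activations.linear

inputs = tf.random.normal((1, 8))
with tf.GradientTape() as tape:
    tape.watch(inputs)
    logits = model(inputs)   # now pre-softmax scores
    score = logits[:, 0]     # class score to attribute
grads = tape.gradient(score, inputs)
```

Gradients of the softmax output saturate when the predicted probability is near 1, which is why Grad-CAM is normally computed on logits.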
-
Hi Matthias,
I am using GCNConv to solve a prediction task with linear layers at the output of the GNN. The model is trained on graphs of 10K nodes with ~20K-40K edges. The gradient value d…
-
**Describe the bug**
I launch DeepSpeed training for a 600M-parameter diffusion model and vary only `reduce_bucket_size`.
I tried the following values:
- `reduce_bucket_size: 500_000_000` — conve…
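For reference, `reduce_bucket_size` lives under `zero_optimization` in the DeepSpeed config JSON; a minimal hedged fragment (all other values are illustrative, not the author's configuration):

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "zero_optimization": {
    "stage": 2,
    "reduce_bucket_size": 500000000
  }
}
```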
-
Dear @1Konny,
Thanks for your implementation!
I have noticed that line 168 in `gradcam.py`:
`alpha_denom = gradients.pow(2).mul(2) + \
activations.mul(gradients.pow(3)).view(b…
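For context, the quoted lines compute the denominator of the Grad-CAM++ alpha weights; a hedged numpy sketch of that computation (variable names follow the snippet, shapes are assumptions):

```python
import numpy as np

# Toy tensors standing in for the real gradients and activations:
# (batch, channels, height, width).
b, k, u, v = 1, 4, 7, 7
rng = np.random.default_rng(0)
gradients = rng.standard_normal((b, k, u, v))
activations = rng.standard_normal((b, k, u, v))

alpha_num = gradients ** 2
# Grad-CAM++: the third-order term is summed over each map's spatial dimensions.
alpha_denom = 2 * gradients ** 2 + np.sum(
    activations * gradients ** 3, axis=(2, 3), keepdims=True
)
# Guard against division by zero, as the original code does.
alpha_denom = np.where(alpha_denom != 0.0, alpha_denom, np.ones_like(alpha_denom))
alpha = alpha_num / alpha_denom
```

This mirrors the paper's alpha formula, grad squared over (2 times grad squared plus the spatial sum of activation times grad cubed), so any discrepancy in the reduction axes of that sum would change the heatmaps.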
-
Thank you for this wonderful example, which helped me understand the gradient descent implementation.
I just noticed a minor mistake:
- dW_curr = np.dot(dZ_curr, A_prev.T) / m
- db_curr = np…
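For context, the lines above come from a standard fully-connected backward step; a hedged sketch of that step (function name and shapes are illustrative, variable names mirror the snippet):

```python
import numpy as np

def single_layer_backward(dZ_curr, W_curr, A_prev):
    """Backward pass through one dense layer, features-by-examples layout."""
    m = A_prev.shape[1]  # number of examples
    dW_curr = np.dot(dZ_curr, A_prev.T) / m                # weight gradient
    db_curr = np.sum(dZ_curr, axis=1, keepdims=True) / m   # bias gradient
    dA_prev = np.dot(W_curr.T, dZ_curr)                    # gradient to pass back
    return dA_prev, dW_curr, db_curr

dZ = np.ones((3, 5))            # upstream gradient for a 3-unit layer, 5 examples
W = np.random.randn(3, 4)
A_prev = np.random.randn(4, 5)
dA_prev, dW, db = single_layer_backward(dZ, W, A_prev)
```

Both `dW_curr` and `db_curr` must be averaged over the batch (the division by `m`), and `db_curr` is reduced over the example axis so its shape matches the bias.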