-
Dear torchtitan team, I have a question regarding gradient norm clipping when using pipeline parallelism (PP) potentially combined with `FSDP/DP/TP`.
For simplicity, let's assume each process/GPU h…
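To make the question concrete, here is a small sketch of the math involved (this is an illustrative helper I wrote, not torchtitan's actual implementation): under PP each rank holds only its own stage's gradients, so a naive per-rank `torch.nn.utils.clip_grad_norm_` would clip against a *local* norm. The global 2-norm is the square root of the sum of squared local norms; in a real PP setup the squared local norms would be combined with an all-reduce over the PP group, which the single-process sketch below replaces with a plain sum.

```python
import torch

def clip_grad_norm_across_stages(stage_grads, max_norm):
    """Clip gradients sharded across pipeline stages (single-process sketch).

    `stage_grads` is a list of per-stage gradient lists. In a real PP run the
    per-stage squared norms would be summed via all_reduce over the PP group;
    here they are simply summed locally to show the math.
    """
    # Squared 2-norm of each stage's gradients.
    local_sq = [sum(g.pow(2).sum() for g in grads) for grads in stage_grads]
    # Global norm = sqrt(sum of per-stage squared norms).
    total_norm = torch.stack(local_sq).sum().sqrt()
    # Uniform clip coefficient applied on every stage.
    clip_coef = min(1.0, max_norm / (total_norm.item() + 1e-6))
    for grads in stage_grads:
        for g in grads:
            g.mul_(clip_coef)
    return total_norm
```

The key point the sketch makes is that the clip coefficient must be computed from the *global* norm and then applied uniformly on every stage, otherwise stages are rescaled inconsistently.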
-
Question for you guys: as best I can tell, there is no support at present for keeping activations in fp8 between the "output" matmul (of either an attention block or MLP block) and the next norm (laye…
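To illustrate what is at stake in keeping those activations in fp8, here is a rough experiment (my own sketch, not anything from the library): it approximates e4m3-style rounding with a crude mantissa-truncation helper (`fake_fp8_e4m3` is a made-up name, and it ignores exponent range and saturation) and measures the relative quantization error the next norm layer would see.

```python
import torch

def fake_fp8_e4m3(x: torch.Tensor, mantissa_bits: int = 3) -> torch.Tensor:
    """Crude e4m3-like rounding (assumption: ignores exponent range/saturation).

    Keeps only `mantissa_bits` explicit mantissa bits so we can measure the
    quantization error a norm layer would see if the activations between the
    output matmul and the next norm were stored in fp8.
    """
    mant, exp = torch.frexp(x)          # x = mant * 2**exp, |mant| in [0.5, 1)
    scale = 2.0 ** (mantissa_bits + 1)  # leading mantissa bit is implicit
    return torch.ldexp(torch.round(mant * scale) / scale, exp)

torch.manual_seed(0)
x = torch.randn(8, 64)
# Per-element relative error introduced by the fp8-like round-trip.
err = (x - fake_fp8_e4m3(x)).abs() / x.abs().clamp_min(1e-12)
```

With 3 mantissa bits the worst-case relative error is about 6%, which is the precision budget the following norm would have to tolerate if the activations stayed in fp8.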
-
### Voice Changer Version
vcc client
### Operating System

win 11
### GPU
3050
### CUDA Version
new
### Read carefully and check the options
- [X] If you use win_cuda_torch_cuda edition, set…
-
![微信截图_20240923214125](https://github.com/user-attachments/assets/d0ba934e-e018-49cf-ac2d-92b146506b29)
-
[Huggingface RMSNorm](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L111) and Torch RMSNorm give slightly different values (=0.0029 on one input…
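One plausible source of a gap of that size in reduced precision (a sketch, not the exact code of either library): HF's LlamaRMSNorm upcasts to fp32, normalizes, casts back to the input dtype, and only then applies the weight, whereas a norm computed entirely in the input dtype rounds at different points. The two reference implementations below are simplified assumptions written for comparison, not copies of either codebase.

```python
import torch

def hf_style_rmsnorm(x, weight, eps=1e-6):
    # Mirrors the HF LlamaRMSNorm recipe: upcast to fp32, normalize,
    # cast back to the input dtype, then apply the weight.
    in_dtype = x.dtype
    x32 = x.to(torch.float32)
    var = x32.pow(2).mean(-1, keepdim=True)
    return weight * (x32 * torch.rsqrt(var + eps)).to(in_dtype)

def same_dtype_rmsnorm(x, weight, eps=1e-6):
    # Same formula, but every step stays in the input dtype.
    var = x.pow(2).mean(-1, keepdim=True)
    return x * torch.rsqrt(var + eps) * weight

torch.manual_seed(0)
w = torch.ones(64, dtype=torch.bfloat16)
x = torch.randn(4, 64, dtype=torch.bfloat16)
# In bf16 the two recipes drift apart; in fp32 they agree bitwise.
diff = (hf_style_rmsnorm(x, w) - same_dtype_rmsnorm(x, w)).abs().max()
```

In fp32 the two functions produce bitwise-identical results, so a difference on the order of 1e-3 only appears once the intermediate rounding points differ, which is consistent with a half-precision input.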
-
Dear author,
I'm sorry to bother you, but I have a problem that is very confusing to me. When I used my metric to draw my black hole shadow (based on the thin disk model), I fo…
-
**System information**
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Y
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
- …
-
```
norm = torch.linalg.norm(y - den_rec, dim=dim, ord=2)  # reconstruction error
rec_grads = torch.autograd.grad(outputs=norm, inputs=x)
rec_grads = rec_grads[0]
normguide = torch.linalg.norm(rec_grads) / x.shape[-1] ** 0.5  # RMS-scaled gradient norm
#n…
-
Hi, thanks for your brilliant work: the release of the paper, the weights (as far as I understood, there's more to be released!), and the code.
I'm very thrilled by your achievements in the omni-modal field, it reall…
-
In the module: `MambaTransformer/mamba_transformer`, you execute the following in `class MambaTransformerblock`:
```python
# Layernorm
self.norm = nn.LayerNorm(dim)
def forwa…