-
I was trying to figure out how to properly handle and update Flux's layers with tied weights (https://github.com/FluxML/Flux.jl/issues/1592).
So first of all I wanted to check how Zygote handles a…
-
### Describe the Bug
### Reproduction:
1. Prepare an example that should not be able to pass the test
Here we use [#53078/commits/modify the test_lerp_op.py](https://github.com/PaddlePaddle/Paddle/pull/53078/commits/518866d67f79b9b1aedc3135fe669881c9a…
-
```
gradients_vars = optimizer.compute_gradients(loss, LAYERS_WIEGHTS)
grads = [grad for grad, var in gradients_vars]
train_step = optimizer.apply_gradients(gradients_vars)
```
Hi, in this code, …
-
### 🐛 Describe the bug
I can't narrow it down further, but `torch.arctan2` seemingly calculates the correct gradients and optimises correctly for fp32, fp64 and bfloat16, but for some reason, the versio…
-
### 🐛 Describe the bug
While testing the optimizers with torch dynamo, I found a small problem in Adagrad: **state["step"]** was assigned to CPU while other parameters are …
-
### Issue type
Bug
### Have you reproduced the bug with TensorFlow Nightly?
Yes
### Source
source
### TensorFlow version
tf 2.12
### Custom code
Yes
### OS platform and d…
-
https://github.com/google/brax/blob/280a1c50fa021b6c17a2a3347fea43a2887382bc/brax/v2/math.py#L278
For people who want to calculate gradients over environment steps, this while loop is a bit annoyin…
-
**Describe the bug**
During our training sessions utilizing Megatron's Mixture of Experts (MoE) layers, we observed a decline in performance occurring at specific steps, with this deterioration manif…
-
LAMMPS runs and terminates after some time:
```
terminate called after throwing an instance of 'std::runtime_error'
what(): The following operation failed in the TorchScript interpreter.
Tracebac…
-
https://app.circleci.com/pipelines/github/pytorch/pytorch/371361/workflows/a14cd549-a37b-47ce-ad6a-a8baa3d40a54/jobs/15666994/steps
```
Aug 26 22:20:15 test_backward_accumulate_grads (__main__.T…