-
### Subject of the issue
Calling **compute_density_BMTI** with **delta_F_inv_cov="LSDI"** raises an error and crashes.
### Your environment
Colab notebook, Python 3.10…
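A minimal sketch of a reproduction, assuming the dadapy `Data` interface; the synthetic dataset and preprocessing steps are my guesses, not from the original report:
```python
import numpy as np
from dadapy import Data

# Hypothetical repro: random 3D points stand in for the real dataset.
X = np.random.default_rng(0).normal(size=(500, 3))
data = Data(X)
data.compute_distances(maxk=100)
data.compute_id_2NN()  # BMTI needs the intrinsic dimension first
data.compute_density_BMTI(delta_F_inv_cov="LSDI")  # this call crashes per the report
```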
-
In fact it looks like we may silently ignore them.
The ctx methods in question are:
- ctx.mark_dirty()
- ctx.mark_non_differentiable()
- ctx.set_materialize_grads()
Repro:
```
import torch
…
```
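For context, here is a self-contained sketch (my illustration, not the original repro) of a custom `autograd.Function` that exercises all three ctx methods from the list above:
```python
import torch

class InplaceReluWithArgmax(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.set_materialize_grads(False)  # backward may then receive None grads
        x.relu_()                         # in-place modification of the input
        ctx.mark_dirty(x)                 # declare the in-place change to autograd
        idx = x.argmax()
        ctx.mark_non_differentiable(idx)  # integer output gets no gradient
        ctx.save_for_backward(x)
        return x, idx

    @staticmethod
    def backward(ctx, grad_x, grad_idx):
        if grad_x is None:                # possible because materialize_grads is off
            return None
        (x,) = ctx.saved_tensors
        return grad_x * (x > 0)

base = torch.randn(5, requires_grad=True)
y, idx = InplaceReluWithArgmax.apply(base.clone())  # clone: leaves can't be mutated in place
y.sum().backward()
print(base.grad, idx.requires_grad)  # grads flow to base; idx.requires_grad is False
```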
-
I am trying to use L-BFGS and related optimizers with nnx + optax, but I'm running into trouble. It might be that `optax` has a slightly different optimization interface in those cases: https://optax.rea…
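For what it's worth, recent optax versions expose `optax.lbfgs()`, whose `update` expects extra `value`, `grad`, and `value_fn` keyword arguments for its line search; I suspect that interface difference is what bites here. A standalone optax-only sketch, without nnx (the toy loss is just for illustration):
```python
import jax
import jax.numpy as jnp
import optax

def loss_fn(params):
    return jnp.sum((params - 3.0) ** 2)  # toy quadratic, for illustration only

params = jnp.zeros(4)
opt = optax.lbfgs()
state = opt.init(params)

for _ in range(10):
    value, grads = jax.value_and_grad(loss_fn)(params)
    # Unlike first-order optimizers, lbfgs' update also wants the loss value,
    # the gradient, and the loss function itself for its line search.
    updates, state = opt.update(grads, state, params,
                                value=value, grad=grads, value_fn=loss_fn)
    params = optax.apply_updates(params, updates)
```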
-
I get the following error when using gradient checkpointing with PEFT LoRA training.
> NotImplementedError
> self.get_input_embeddings()
```
Traceback (most recent call last):
File "/home…
-
I would like to implement the grokfast algorithm (an exponentially weighted moving average of past gradients added to the current gradients) with GradCache. I've been able to use it without GradCac…
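For reference, a standalone sketch of the grokfast-style EMA filter I mean, without the GradCache integration (the names follow the grokfast reference implementation; it is applied between `loss.backward()` and `optimizer.step()`):
```python
import torch

def gradfilter_ema(model, grads=None, alpha=0.98, lamb=2.0):
    # grads holds the running EMA of past gradients, keyed by parameter name.
    if grads is None:
        grads = {n: p.grad.detach().clone()
                 for n, p in model.named_parameters() if p.grad is not None}
    for n, p in model.named_parameters():
        if p.grad is None:
            continue
        grads[n] = grads[n] * alpha + p.grad.detach() * (1 - alpha)
        p.grad = p.grad + grads[n] * lamb  # amplify the slow gradient component
    return grads
```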
-
I get an error during stage 2.
```
Traceback (most recent call last):
File "/mnt/petrelfs/wangzhao/HumanT2V/AnimateAnyone/train_svd.py", line 803, in
main(config)
File "/mnt/petrelfs/wangz…
-
### 🐛 Describe the bug
Hi!
I'm trying to backpropagate through my model along multiple directions, so I'm using `torch.autograd.grad` with `is_grads_batched=True`. I had no problem using it on an MLP, but wh…
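Here is the kind of MLP usage that works for me, for comparison (toy shapes, just to show the batched-`grad_outputs` convention):
```python
import torch

mlp = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Tanh(),
                          torch.nn.Linear(8, 3))
x = torch.randn(1, 4, requires_grad=True)
y = mlp(x)

# One direction per row; each slice along dim 0 must match y's shape (1, 3).
directions = torch.eye(3).unsqueeze(1)  # shape (3, 1, 3)
grads, = torch.autograd.grad(y, x, grad_outputs=directions,
                             is_grads_batched=True)
print(grads.shape)  # (3, 1, 4): one input gradient per direction
```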
-
Thank you for this package. I'm looking for an example of how to implement a simple MLP (multilayer perceptron) with this package. Any code snippets or tutorials are welcome.
Below is some code th…
-
## 🐛 Bug
The following test fails for DDP:
```
@unittest.skipIf(BACKEND != 'nccl' and BACKEND != 'gloo',
"Only Nccl & Gloo backend support DistributedDataParallel")
…
-
I am a bit confused by the ensure_shared_grads here: https://github.com/ikostrikov/pytorch-a3c/blob/master/train.py#L13. Here, the `grad` is synced only when it is `None`. I think we need to set `sha…
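For reference, the linked function reads roughly like this (quoted from memory of the repo, so treat it as approximate):
```python
def ensure_shared_grads(model, shared_model):
    for param, shared_param in zip(model.parameters(),
                                   shared_model.parameters()):
        if shared_param.grad is not None:
            return  # grads already point at this worker's tensors
        shared_param._grad = param.grad
```
As I read it, the early return assumes that after the first call in a worker, `shared_param.grad` already aliases that worker's `param.grad` tensor, so later backward passes are picked up without re-assigning.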