-
I'm a beginner trying Unsloth. I ran the free notebook [Llama 3 (8B)](https://colab.research.google.com/drive/1bX4BsjLcdNJnoAf7lGXmWOgaY8yekg8p?usp=sharing#scrollTo=yqxqAZ7KJ4oL) and then got the fol…
-
Is there a plan to develop a memory-efficient back-propagation training mode? Perhaps a flag that, when activated, recomputes the forward-pass network states during back-propagation by inverting…
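For reference, PyTorch already ships a recomputation-based (though not inversion-based) version of this idea: gradient checkpointing discards intermediate activations and re-runs the forward pass per block during backward. A minimal sketch, assuming a plain stack of blocks:

```python
# Minimal sketch of recomputation-based back-propagation with PyTorch's
# built-in gradient checkpointing. Unlike truly reversible layers, nothing is
# inverted: activations inside each block are simply discarded and recomputed
# during the backward pass, trading extra compute for lower memory.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    def __init__(self, dim: int = 1024, depth: int = 8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU()) for _ in range(depth)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            # Activations inside `block` are not stored; they are recomputed
            # when gradients for this block are needed.
            x = checkpoint(block, x, use_reentrant=False)
        return x

model = CheckpointedMLP()
out = model(torch.randn(4, 1024, requires_grad=True))
out.sum().backward()
```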
-
Hello, `big_vision` team!
Thanks for your work on the repository. Looking through the code, I noticed that ViT uses classical attention (see [line 91 of the ViT implementation](https://github.com/go…
-
This is a discussion of how to minimize memory usage of attention.
Current state: investigating apex's [scaled_masked_softmax](https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/main/me…
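For context, a rough sketch of where the memory goes and what chunking buys (plain PyTorch, not the apex kernel): naive attention materializes a [batch, heads, seq, seq] score matrix, while processing queries in chunks only ever holds a [batch, heads, chunk, seq] slice of it.

```python
# Illustrative comparison only; the fused apex/Megatron kernels work differently
# internally, but the peak-memory argument is the same.
import torch

def naive_attention(q, k, v, scale):
    scores = torch.einsum("bhqd,bhkd->bhqk", q, k) * scale  # O(seq^2) activation
    return torch.einsum("bhqk,bhkd->bhqd", scores.softmax(dim=-1), v)

def chunked_attention(q, k, v, scale, chunk=256):
    outs = []
    for start in range(0, q.shape[2], chunk):
        q_chunk = q[:, :, start:start + chunk]
        scores = torch.einsum("bhqd,bhkd->bhqk", q_chunk, k) * scale
        outs.append(torch.einsum("bhqk,bhkd->bhqd", scores.softmax(dim=-1), v))
    return torch.cat(outs, dim=2)

q = k = v = torch.randn(2, 8, 1024, 64)
scale = q.shape[-1] ** -0.5
assert torch.allclose(naive_attention(q, k, v, scale),
                      chunked_attention(q, k, v, scale), atol=1e-5)
```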
-
`FusedLayerNormAffineFunction` requires a `memory_efficient` argument
https://github.com/NVIDIA/apex/blob/08f740290f999296d319ed2e3f21cd00b810918a/apex/normalization/fused_layer_norm.py#L34
but Megatr…
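One possible workaround sketch on the caller side: check the installed apex signature with `inspect` and only pass `memory_efficient` when it is accepted, so older and newer apex releases both work. The argument order below follows apex's `fused_layer_norm.py`; verify it against the version you have installed.

```python
# Compatibility sketch (not an official fix): pass `memory_efficient` only if
# the installed FusedLayerNormAffineFunction.forward actually takes it.
import inspect
from apex.normalization.fused_layer_norm import FusedLayerNormAffineFunction

def fused_layer_norm_affine(x, weight, bias, normalized_shape, eps=1e-5,
                            memory_efficient=False):
    args = inspect.getfullargspec(FusedLayerNormAffineFunction.forward).args
    if "memory_efficient" in args:
        return FusedLayerNormAffineFunction.apply(
            x, weight, bias, normalized_shape, eps, memory_efficient)
    # Older apex: no memory_efficient parameter at all.
    return FusedLayerNormAffineFunction.apply(
        x, weight, bias, normalized_shape, eps)
```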
-
We can add a global thread-safe dictionary to de-duplicate terms in memory. This will guarantee that the memory cost matches that of an efficient DAG representation while preserving the tree data structure.…
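A minimal sketch of the idea (names and structure are illustrative, not the actual implementation): a lock-guarded global table interns structurally equal terms, so identical subtrees share one object while client code still sees a plain tree.

```python
import threading
from dataclasses import dataclass

_intern_lock = threading.Lock()
_intern_table: dict = {}

@dataclass(frozen=True)
class Term:
    head: str
    args: tuple  # tuple of already-interned Term objects

def make_term(head: str, args: tuple = ()) -> Term:
    # Children are already de-duplicated, so identity stands in for
    # structural equality and keeps the lookup cheap.
    key = (head, tuple(id(a) for a in args))
    with _intern_lock:
        cached = _intern_table.get(key)
        if cached is None:
            cached = Term(head, args)
            _intern_table[key] = cached
        return cached

# Two structurally identical subtrees resolve to the same object.
x = make_term("x")
left = make_term("f", (x, x))
right = make_term("f", (x, x))
assert left is right
```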
-
I have this error.
Can anybody help me?
Error occurred when executing DynamiCrafterInterp Simple:
No operator found for memory_efficient_attention_forward with inputs:
query : shape=(80, 2560,…
-
Hello, I am using `torch.cuda.amp.autocast` with `bfloat16`.
I noticed that the xformers `RotaryEmbedding` produces `float32` outputs, which then require casting before passing to `memory_efficien…
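A possible workaround sketch, assuming the usual `RotaryEmbedding(q, k)` call signature and import path (verify both against your xformers version): cast the rotary outputs back to `bfloat16` before calling `memory_efficient_attention`, which expects `q`/`k`/`v` to share one dtype.

```python
# Workaround sketch only; the RotaryEmbedding API details are an assumption.
import torch
import xformers.ops as xops
from xformers.components.positional_embedding import RotaryEmbedding

B, S, H, D = 2, 1024, 8, 64
rotary = RotaryEmbedding(D).cuda()

with torch.cuda.amp.autocast(dtype=torch.bfloat16):
    q = torch.randn(B, H, S, D, device="cuda", dtype=torch.bfloat16)
    k = torch.randn(B, H, S, D, device="cuda", dtype=torch.bfloat16)
    v = torch.randn(B, H, S, D, device="cuda", dtype=torch.bfloat16)

    q, k = rotary(q, k)                              # may come back as float32
    q, k = q.to(torch.bfloat16), k.to(torch.bfloat16)

    # memory_efficient_attention expects [batch, seq, heads, head_dim]
    out = xops.memory_efficient_attention(
        q.transpose(1, 2).contiguous(),
        k.transpose(1, 2).contiguous(),
        v.transpose(1, 2).contiguous(),
    )
```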
-
# 🐛 Bug
`torch.jit.trace` breaks with the following error:
`RuntimeError: unsupported output type: int, from operator: xformers::efficient_attention_forward_generic`
The output of the ops conta…