-
# 🐛 Bug
## Command
python3 -m xformers.info
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be remov…
-
**Pros**:
`no-op` in the following statements means that no kernel is launched; it would still require some changes to metadata.
- `transpose` and `reorder` are now no-ops (we can have an additional parameter i…
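
To illustrate what a metadata-only `transpose` looks like, here is a minimal sketch using NumPy. NumPy is an assumption made purely for illustration and is not the library under discussion; the names `a`, `t`, and `t_copy` are hypothetical. The point is only that a transpose can be expressed as a change of shape and strides, with no data movement.

```python
import numpy as np

# A 3x4 array of 8-byte integers stored in row-major (C) order.
a = np.arange(12, dtype=np.int64).reshape(3, 4)

# Transposing returns a view: only shape and strides (metadata) change.
t = a.T

same_buffer = np.shares_memory(a, t)  # the view aliases the original buffer
strides_before = a.strides            # (32, 8): 32 bytes per row step, 8 per column step
strides_after = t.strides             # (8, 32): the same numbers, swapped

# Materializing the transposed layout is the step that actually copies data.
t_copy = np.ascontiguousarray(t)
```

In this sketch, only the final `np.ascontiguousarray` call touches the data; on a GPU, that would be the point where a kernel launch is needed.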
-
In some cases, it seems the GPU meta optimizer does not get good timings because it does not have the strides information. This describes how to get it, in an incremental way. This is much easier than the …
nouiz updated
7 years ago
-
The incompatibility is that during the backward pass, fused_rmsnorm does dynamic control flow over strides, which isn't safe for the export tracing used by PP.
```
dy = dy.view(-1, dy.shape[-1])
…
-
**User Story**
As a cluster operator, I want to know the list of dependencies Cluster API brings, for assurance within our organisation's software supply chain.
**Detailed Description**
* Cr…
-
Last year I followed the implementation logic in this file, https://github.com/openxla/xla/blob/7954169ccfb6290d94af3ea3634229b097682ba8/xla/service/gpu/runtime/gemm.cc
and the input parameters were…
-
@dmargala has pointed out that we're not actually using the batched version of the cholesky solve, and we can benefit from strided arrays to get that to work.
This is his snippet to get this to wor…
-
I am re-implementing the enhancement of DP-SGD through the [random sparsification](https://github.com/JunyiZhu-AI/RandomSparsification) of gradients on my UNet Model.
Here is a Debug info on extend…
-
**Is your feature request related to a problem? Please describe.**
Colors typically are too bright.
**Describe the solution you'd like**
My understanding is that this can't be fixed for the …
-
Thank you for your great work. We tried fine-tuning with a learning rate of 1e-5, frame stride set to random in 1-6, resolution: [576, 1024], video_length: 16. After training for a period of time, the…