-
Hello team,
we have been noticing some pretty large deviations between the attention output of flash/unfused attention versus the fused attention kernels when sliding window attention is active. The …
-
**Documentation:**
[torch.export](https://pytorch.org/docs/stable/export.html)
**Examples:**
[exported_program.pt2.zip](https://github.com/lutzroeder/netron/files/13855206/exported_program.pt2.zip)
[…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
…
-
Hi,
Do you plan to add tracking of the PC's parameters by PyTorch? For example, a switch that would change the `requires_grad` of the PC's parameters from `False` to `True`. I would like to use the…
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://github.com/prefix-dev/pixi/releases) of pixi, …
-
Hello team,
we noticed discrepancies when using the `transformer_engine.pytorch.TransformerLayer` in combination with fused attention kernels and multi/group-query attention, `fuse_qkv_params` and `q…
-
Today, the canary repo is kept in sync on an ad-hoc basis. Let's automate it.
The one complicating factor is that there are a couple of commits that are pytorch-canary specific and need to be reapplied …
-
**Feature Overview**
This Feature card is for transitioning our model training infrastructure from DeepSpeed to PyTorch's Fully Sharded Data Parallel (FSDP) to enhance training metrics visibility, bro…
-
## 📚 Documentation
xla/docs/README contains the following text. Is this text still relevant? The link to CircleCI is broken and I'm not sure if this information is useful:
------------------------…
-
### Checklist
- [X] The issue exists after disabling all extensions
- [X] The issue exists on a clean installation of webui
- [X] The issue is caused by an extension, but I believe it is caused by a …