-
add a build matrix for that, and fix the errors.
You shouldn't use that though. It's only to ease taking patches upstream.
Supporting this would have made it easier to take patches to upstream.
But …
-
### 🐛 Describe the bug
When running the command `make triton`, the following error is generated:
`export USE_XPU=1
make triton`
Building wheels for collected packages: triton
Building wheel …
-
### 🐛 Describe the bug
Currently, the Inductor CPP Backend includes a design for split loop levels, where the inner loop level can be divided into a vectorization loop level and a scalar loop level…
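
To illustrate the general idea of splitting an inner loop into a vectorization level and a scalar level, here is a minimal sketch in plain C. It is not Inductor's generated code: the function name `add_split` and the constant `VEC_WIDTH` are hypothetical, and the fixed-width inner loop merely stands in for what a backend would emit as SIMD instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vector width; a real backend would derive it from the target ISA. */
#define VEC_WIDTH 8

/* Adds two float arrays element-wise. The loop over n is split into:
 *   1. a main loop that processes VEC_WIDTH elements per iteration
 *      (the part that would be vectorized), and
 *   2. a scalar tail loop for the remaining n % VEC_WIDTH elements. */
void add_split(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);

    size_t vec_end = n - (n % VEC_WIDTH);
    size_t i = 0;

    /* Vectorization loop level: full chunks of VEC_WIDTH elements. */
    for (; i < vec_end; i += VEC_WIDTH) {
        for (size_t lane = 0; lane < VEC_WIDTH; ++lane) {
            out[i + lane] = a[i + lane] + b[i + lane];
        }
    }

    /* Scalar loop level: leftover elements that do not fill a chunk. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

With `n = 11` and `VEC_WIDTH = 8`, the first loop handles indices 0 through 7 and the tail loop handles indices 8 through 10.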
-
### 🐛 Describe the bug
`torch.cumprod` silently casts the output dtype to `torch.int64` regardless of the actual input's dtype.
It would be better if these APIs could keep the output's dtype to …
-
### 🐛 Describe the bug
Under certain circumstances, `nn.Linear` leaks memory on MPS. The exact failure mode and the conditions that lead to the leak are unclear at the moment. I'll give an upda…
-
Commit c844eac5926d1efbdfbf2e8bcc3989ba6a6aee50 has triggered [CPANtesters failures for the Devel-Cover distribution](http://fast-matrix.cpantesters.org/?dist=Devel-Cover;perl=5.41.2;reports=1#).
[…
-
### 🐛 Describe the bug
In DeepSpeed workloads, ZeRO parameter offload is implemented via module hooks. We find that in the `torch.compile` scenario, if any graph break happens in the pre/post hook of a …
-
Devel::GoFaster provides a prototype that optimizes the optree for some common patterns
```
//= 1
```
@arc also suggested in https://github.com/p5h/p5summit-2019/issues/18
considering
```
E…
-
### 🐛 Describe the bug
This issue was discovered in #134184 and left as follow-up work.
There's a correctness issue in `nn.ReplicationPad1d` and `nn.ReplicationPad2d` with certain shapes as inpu…
-
In C code it's common to walk the child ops of a given optree node with code such as:
```C
OP *kid = o->op_first;
while(kid) {
    ...
    kid = OpSIBLING(kid);
}
```
The sibling list is termi…
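
As a standalone illustration of the traversal pattern above, here is a minimal sketch that compiles without the Perl headers. The `node` struct, `node_sibling()`, and `count_kids()` are hypothetical stand-ins, not Perl's real `OP` struct or `OpSIBLING()` macro, and the sketch assumes the mock sibling list ends with a `NULL` pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an optree node; NOT Perl's real OP struct. */
typedef struct node {
    struct node *first;    /* first child, or NULL if there are no children */
    struct node *sibling;  /* next sibling, or NULL at the end of the list (assumption) */
} node;

/* Hypothetical stand-in for the OpSIBLING() macro used in the snippet above. */
static node *node_sibling(const node *n)
{
    return n->sibling;
}

/* Counts the direct children of parent by walking its sibling list. */
size_t count_kids(const node *parent)
{
    assert(parent != NULL);

    size_t count = 0;
    for (node *kid = parent->first; kid != NULL; kid = node_sibling(kid)) {
        ++count;
    }
    return count;
}
```

Writing the walk as a `for` loop keeps the advance step in the loop header, so a `continue` inside the body cannot skip it by accident.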