-
Repro script: https://gist.github.com/davidberard98/3c746cd0c8bd79d40353bb0b263f9518
Usage: from the AWS cluster, reserve two GPUs on a compute node. From there, run:
```
$ python profiler_error.py…
-
When running tests on `main`, there is a crash.
We have had other issues with short sequence lengths and large batch sizes on T5; we are not sure why.
```log
❯ pytest test/test_torchdynamo.py -k "dynamo_optimized_cuda_grap…
-
## 🚀 Feature
To build on our channels-last support, we want to extend it to arbitrary permutations.
The challenge would be maintaining behavior coherent with eager (TensorIt…
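As a rough illustration of the idea (this is not code from the proposal): channels-last is already one fixed permutation, expressed through strides, and an arbitrary permutation is the general case.

```python
import torch

x = torch.randn(2, 3, 4, 5)                   # logical NCHW, contiguous
cl = x.to(memory_format=torch.channels_last)  # NHWC physical layout
assert cl.stride() == (60, 1, 15, 3)

# An arbitrary permutation produces the same kind of stride-encoded
# layout; the feature request is for ops to handle these coherently.
p = x.permute(0, 2, 3, 1).contiguous().permute(0, 3, 1, 2)
assert p.stride() == cl.stride()
```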
-
### 🐛 Describe the bug
Error extracted from failures encountered in https://github.com/pytorch/pytorch/pull/71299
```
PYTORCH_NVFUSER_DISABLE_FALLBACK=1 python opinfo_failure_2.py
Traceback (m…
-
## 🚀 Feature
There is a performance opportunity in fusing the bias of the projection linear layer into the LogSoftmax, which can be expensive since the hidden size out of the projection is `30258`. We just need…
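A minimal sketch of the unfused pattern described above (the batch and input sizes are assumptions for illustration; only the `30258` output size comes from the report):

```python
import torch
import torch.nn.functional as F

batch, hidden, vocab = 2, 8, 30258  # small batch/hidden; vocab size from the report
x = torch.randn(batch, hidden)
w = torch.randn(vocab, hidden)
b = torch.randn(vocab)

# Unfused: the bias add materializes a full (batch, vocab) intermediate,
# which log_softmax then reads again over the large 30258-wide dimension.
out = F.log_softmax(x @ w.t() + b, dim=-1)
assert out.shape == (batch, vocab)
```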
-
### 🐛 Describe the bug
I'm hitting an issue when size-0, rank-1 tensors go through a reduction kernel.
```
import torch …
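# An assumed minimal continuation (the original repro above is truncated):
x = torch.zeros(0)      # size-0, rank-1 tensor
y = x.sum()             # reduction kernel over an empty tensor
assert y.item() == 0.0  # eager returns 0 for an empty sum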
-
### 🐛 Describe the bug
```python
import torch
torch._C._jit_set_nvfuser_enabled(True)
torch._C._jit_set_texpr_fuser_enabled(False)
torch._C._jit_set_profiling_executor(True)
torch._C._jit_se…
-
It looks like `normalize_ir()` behaves differently than I thought.
For example, I was hoping functionalization would replace in-place ops with their out-of-place counterparts, e.g. `relu_` with `relu`.
I run a simple test pro…
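For reference, the in-place/out-of-place pair mentioned above behaves like this in eager (illustrative only, not the original test program):

```python
import torch

x = torch.tensor([-1.0, 2.0])
y = torch.relu(x)  # out-of-place: x is untouched
x.relu_()          # in-place: x is mutated
# Functionalization is expected to rewrite the in-place form into the
# out-of-place one (plus a write-back), so both produce the same values.
assert torch.equal(x, y)
```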
-
## 🚀 Feature
We need to add parser support in `parser.cpp` for the `aten::_softmax` op. LTC traces the `_softmax` variant of the op, which is different from what TorchScript produces.
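For context, `torch.nn.functional.softmax` bottoms out in the internal `aten::_softmax(Tensor self, int dim, bool half_to_float)` op, which is the variant LTC records in its trace. A quick eager check (illustrative only):

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 3)
# F.softmax decomposes to the internal aten::_softmax op under the hood.
y = F.softmax(x, dim=-1)
assert torch.allclose(y.sum(dim=-1), torch.ones(2))
```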
-
## 🐛 Bug
When I create a `jit.script` function that includes `torch.nn.functional.dropout` without a constant `is_training` parameter, the fusion does not work. This did previously work.
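A hypothetical sketch of the pattern described (the function name and shapes are assumptions, not the original repro):

```python
import torch
import torch.nn.functional as F

@torch.jit.script
def f(x: torch.Tensor, training: bool):
    # `training` is a runtime argument rather than a compile-time
    # constant, which is the case where fusion reportedly breaks.
    return F.dropout(x, p=0.5, training=training)

y = f(torch.ones(4), False)
assert torch.equal(y, torch.ones(4))  # dropout is the identity when not training
```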
## To…