-
```
$ NVFUSER_DUMP=fusion_ir bin/nvfuser_tests --gtest_filter=*Noncancellable_CatOnlySubsetOfSplitOutputs
```
```
[ RUN ] MoveSplitCatTest.Noncancellable_CatOnlySubsetOfSplitOutputs
%ker…
-
Hey folks, I stumbled into a CUDA assumption (on my non-CUDA machine).
Here's the fix that worked for me, but it's obviously not very general:
```
diff --git a/functorch/_src/compilers.py b/functorch/_src/comp…
-
```"B:\python\lib\site-packages\torch\lib\nvfuser_codegen.dll" or one of its dependencies.```
I ran the mentioned bat file, as suggested in recent issues, but this did not fix the problem. The fi…
-
```
a = torch.tensor((0.0011-1.5705j,), device='cuda', dtype=torch.complex64)
fs = Fusion()
with FusionDefinition(fs) as fd:
nv_a = fd.define_tensor(sizes=a.shape, strides=a.stride(), dtype=…
-
### 🐛 Describe the bug
Benchmark commandline:
```
PYTORCH_NVFUSER_DUMP=python_definition,fusion_args python -u benchmarks/huggingface.py --training -d cuda --fast --backend nvprims_nvfuser --skip-a…
-
# Background
reshape/view in nvfuser doesn't imply memory aliasing, so we'll refer to it as reshape in this issue to keep the conversation simple and accurate.
nvfuser reshape is implement…
-
### 🐛 Describe the bug
```python
# debug_aev_nvfuser_minimal.py
import torch
torch._C._jit_set_nvfuser_single_node_mode(True)
torch._C._debug_set_autodiff_subgraph_inlining(False)
torch.ma…
-
repro (pjnl-20240910):
```
NVFUSER_ENABLE=kernel_debug PYTORCH_NO_CUDA_MEMORY_CACHING=1 compute-sanitizer bin/nvfuser_tests --gtest_filter='*FusionReductionWithTrivialReduction_CUDA*'
```
sample stac…
-
### 🐛 Describe the bug
`torch.jit.script` gives inconsistent results when using `torch.half`. The result of the first execution of the function is different from that of the second execution, but the result…
-
Currently, when parallel_compile is enabled (note that this is also our current default behavior), `FusionKernelRuntime` hides all compilation error messages. See here:
https://github.com/NVIDIA/Fuser/blob…