-
HIP has a device-side `abort()` function that is emulated with `asm("trap;");` on the CUDA platform - see https://github.com/ROCm-Developer-Tools/HIP/issues/233
However, the behavior of `abort()` i…
-
Hello,
I have ROCm installed on Ubuntu 22.04. (Dumps from rocminfo and clinfo follow the question.)
I am trying various benchmarks for PyTorch.
It seems like PyTorch is still just using my CPU c…
-
We can use:
- g2d (there are two drivers for it: an old one in the media directory and a new one in the drm directory, but the new one doesn't support our SoC and would also require changes in libdrm)
- powervr sg…
-
### EDIT: Minimal Repro:
```
import torch
from torch._inductor import config
config.save_args = True
@torch.compile(dynamic=True)
def foo(x):
    output = torch.zeros_like(x)
    n = out…
-
### 🐛 Describe the bug
I was trying out the fusion benchmarking option in Inductor (`torch._inductor.config.benchmark_fusion = True`) and noticed that the Triton benchmark of operators (unfused a…
-
In some cases, HIP appears unable to unroll a loop that `nvcc` unrolls without a problem, as in the following example:
```c++
#include
template
__global__ void my_k…
-
# After quantizing the YOLOv5 model, mAP@0.5 is high (around 0.47), but the detection results are wildly wrong and unexpected
Recently I have been trying to quantize yolov5_nano by …
-
An open issue for discussing potential ideas or improvements to documentation, interfaces, examples, bug fixes, and so on.
-
### 🐛 Describe the bug
**Short description**
I use a custom autograd function that calls custom Triton kernels. Inductor doesn't honor the `x = x.contiguous()` call made before the Triton kernel.
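
For readers unfamiliar with why a dropped `contiguous()` call matters, here is a minimal pure-Python sketch of the idea. It is not the code from this report and does not use PyTorch or Triton; the helpers `contiguous_strides` and `read` are hypothetical names introduced only for illustration. It assumes the kernel computes offsets as if its input were row-major contiguous, and shows that such a kernel reads the wrong element when handed a transposed view.

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, in elements, for the given shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def read(storage, strides, index):
    """Read one element from flat storage using explicit strides."""
    offset = sum(i * s for i, s in zip(index, strides))
    return storage[offset]


# A 2x3 row-major matrix stored flat: [[0, 1, 2], [3, 4, 5]].
storage = [0, 1, 2, 3, 4, 5]
strides = contiguous_strides((2, 3))

# A transposed view shares the same storage but swaps shape and strides.
t_shape = (3, 2)
t_strides = (strides[1], strides[0])

# Reading the view at [1, 0] with the view's own strides is correct.
correct = read(storage, t_strides, (1, 0))

# A kernel that assumes contiguity derives strides from the shape alone,
# so it lands on a different element of the same storage.
assumed = read(storage, contiguous_strides(t_shape), (1, 0))

print(correct, assumed)
```

If the two values differ, the layout assumption is what broke the read. Making the input contiguous before the kernel runs is what keeps the two stride sets in agreement.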
**Long…
-
```
===> Testing for spheral-2023.03.0
===> spheral-2023.03.0 depends on file: /usr/local/bin/python3.9 - found
cd /usr/ports/science/spheral/work/.build && /usr/bin/env F77="gfortran12" F90="g…