-
Greetings, thanks for all the hard work. I got the latest JCuda to build on the Orin Nano by commenting out the breakages. Forked repository at [github.com/neocoretechs/jcuda-1](https://github.com/neocoretechs/j…
-
What would it realistically take to implement support for DLSS within this project? Could we make use of DXVK-NVAPI for such a feature?
This is a wonderful project and I’m looking forward to contribu…
-
The HIP example at https://github.com/zjin-lcf/HeCBench/tree/master/src/wmma-hip is similar to the code in https://rocm.docs.amd.com/projects/rocWMMA/en/latest/API_Reference_Guide.html:
The matrices A…
-
### Problem Description
Hello,
I'm trying to compile version 6.2.0, but I receive an error after a while. I configure the project with these parameters:
mkdir build && cd build
cmake \
-Wno-d…
-
In unet there is this pattern:
```
@249 = gpu::code_object[code_object=7216,symbol_name=mlir_dot_add,global=1310720,local=256,](@248,@244,@245,@247) -> half_type, {2, 4096, 5120}, {20971520, 5120,…
-
Hi, I'm benchmarking vLLM on 4 × V100, and I see that performance is no better when using multiple GPUs.
It seems NCCL takes most of the time. Have you ever seen this issue?
```
==54415== Pro…
-
### Description
I am working on translating the CUDA matrix multiplication samples to Spiral, and I am getting a really uninformative error that I am not sure how to handle. This might not be an is…
-
I would like to use CUTLASS to perform matrix multiplication within a CUDA kernel. Specifically, before the matrix multiplication, I need to do something to load the input matrices A(mxk) and B(kxn) o…
-
This thread is dedicated to discussing the setup of the webui on AMD GPUs.
You are welcome to ask questions as well as share your experiences, tips, and insights to make the process easier for all…
-
It appears upstream now optionally supports AMD GPUs using ROCm (as seen here https://github.com/CompVis/stable-diffusion/issues/48) -- would it be possible to include support in stable-diffusion-ui?