wmma-api Search Results

69 results for wmma-api


jcuda/jcuda #39

Cuda 11.4 Jetson Orin Nano failures

Greetings, thanks for all the hard work. I got latest JCuda to build on Orin Nano via commenting out breakages. Forked repository at [github.com/neocoretechs/jcuda-1](https://github.com/neocoretechs/j…

neocoretechs updated 1 year ago
9

vosen/ZLUDA #87

DLSS Support Outline

What would it realistically take to implement support for DLSS within this project? Could we make use of DXVK-NVAPI for such feature? This is a wonderful project and I’m looking forward to contribu…

Weather-OS updated 3 weeks ago
10

ROCm/rocWMMA #387

verification fails for the simple MM example

The HIP example at https://github.com/zjin-lcf/HeCBench/tree/master/src/wmma-hip is similar to codes in https://rocm.docs.amd.com/projects/rocWMMA/en/latest/API_Reference_Guide.html: The matrices A…

jinz2014 updated 5 months ago
5

ROCm/composable_kernel #1477

[Issue]: 6.2.0 compilation issue

### Problem Description Hello, I'm trying to compile version 6.2.0 but I receive an error after a while. I configure the project with these parameters: mkdir build && cd build cmake \ -Wno-d…

RandUser123sa updated 3 weeks ago
16

ROCm/AMDMIGraphX #2767

MLIR: Fuse slice operators with a elementwise

In unet there is this pattern: ``` @249 = gpu::code_object[code_object=7216,symbol_name=mlir_dot_add,global=1310720,local=256,](@248,@244,@245,@247) -> half_type, {2, 4096, 5120}, {20971520, 5120,…

pfultz2 updated 3 months ago
4

vllm-project/vllm #2442

Bad performance when using tp = 4 on V100

Hi I'm benchmarking vLLM on 4 * V100, and I see the performance is no better when using multiple gpus. Seems the nccl takes most of the time. Have you ever seen this issue? ``` ==54415== Pro…

sleepwalker2017 updated 7 months ago
2

cupy/cupy #8156

Invalid image error when compiling a larger example with NVR…

### Description I am working on translating the Cuda matrix multiplication samples to Spiral, and I am getting a really uninformative error for which I am not sure what to do. This might not be an is…

mrakgr updated 9 months ago
5

NVIDIA/cutlass #1164

[QST] How to call cutlass API within a cuda kernel?

I would like to use cutlass to perform matrix multiplication within a cuda kernel. Specifically, before the matrix multiplication, I need to do something to load the input matrices A(mxk) and B(kxn) o…

mathfirst updated 8 months ago
8

oobabooga/text-generation-webui #3759

AMD thread

This thread is dedicated to discussing the setup of the webui on AMD GPUs. You are welcome to ask questions as well as share your experiences, tips, and insights to make the process easier for all…

oobabooga updated 4 days ago
319

easydiffusion/easydiffusion #115

AMD GPU support

It appears upstream now optionally supports AMD GPUs using ROCm (as seen here https://github.com/CompVis/stable-diffusion/issues/48) -- would it be possible to include support in stable-diffusion-ui?

prurigro updated 4 months ago
113
