simt Search Results - Githubissues

495 results for simt


intel/intel-xpu-backend-for-triton #940

Matmul kernel scratch space exceeds HW support limit

The Triton tutorial 03-matrix-multiplication.py starts to fail after a recent software update to oneAPI 2024 with "total scratch space exceeds HW supported limit". ocloc -spirv_input -file matmul_k…

pengtu updated 3 months ago
2
iree-org/iree #15526

SDXL Perf tracker for ROCM

Pip version used for fetching Tracy profiles : ``` iree-compiler 20231106.574 iree-runtime 20231106.574 ``` I captured an end-to-end Tracy profile along with Unet for R…

Abhishek-Varma updated 4 months ago
7
iree-org/iree #15078

Failed vector distribution for n-D vectors in fused reductio…

### What happened? I'm trying to compile a tm_tensor model through iree-compile, and this is the output: ``` Please report issues to https://github.com/openxla/iree/issues and include the crash b…

gpetters94 updated 3 months ago
7
ROCm/HIP #3263

Porting HIP to run on other (non-AMD) accelerators

Hello, If I were to have a novel accelerator, I would need to have a programming model for users to interact with it. From my understanding, AMD wants HIP to be a de facto programming model for the …

jdayal-ssi updated 5 months ago
10
nod-ai/SHARK-Studio #1245

[vulkan] Intel Arc Fails to Compile

It seems like Intel Arc support is supposed to be present — it shows up as a Vulkan device at least! But in trying to run this I get compile failures Here's the detailed log from the command prompt on…

jarredwalton updated 1 year ago
6
NVIDIA/cutlass #1520

[QST] sm70 and sm80 CuTe examples are tiling ordinary float …

In the sm_70.cu and sm_80.cu CuTe examples, am I understanding correctly that they are tiling ordinary float multiplication, and not any complex MMA instruction? https://github.com/NVIDIA/cutlass/…

ericauld updated 6 months ago
1
intel/intel-xpu-backend-for-triton #526

Hand rewrite LLVM IR for GEMM using SIMT instructions

Create a LLVMIR file that uses DPAS, 2d block read/write, and prefetch instructions to get an estimate initial performance at SIMT path. One could use compiler to generate the LLVM IR file, but not…

whitneywhtsang updated 8 months ago
20
triton-lang/triton #1192

[Triton SPIRV] Intel GPU support for Triton MLIR

Hi Dears, This is an RFC for supporting the Intel GPU in Triton language. The tool chain for Intel GPU is based on the SYCL/SPIRV spec. As a result, we'd like to upstream the SPIRV support to th…

chengjunlu updated 6 months ago
4
accel-sim/accel-sim-framework #319

Bison error at ptx_parser_decode.def

I am using a modified version of Accelsim which was fine with GCC-9.4 and CUDA 11.6 (driver and toolkit). Recently, I have upgraded the system with CUDA 12.2 (driver) and 11.6 (toolkit) and I was able…

mahmoodn updated 4 months ago
1
accel-sim/accel-sim-framework #318

Rodinia simulation config

Hi, accel-sim developers: - I am new to GPU simulation. I am following the steps in the README.md instructions; at the step of ./util/job_launching/run_simulations.py -B rodinia_2.0-ft -C CONFIG -T…

Leon924 updated 4 months ago
5

