simt Search Results - Githubissues

495 results for simt

Best match

iree-org/iree #13202

[CUDA] error: 'func.func' op exceeded GPU memory limit

### What happened? Hitting this error in CUDA compilation: ``` error: 'func.func' op exceeded GPU memory limit of 166912 bytes for function. Got 5767168 bytes ``` [Full error log](https://gis…

silvasean updated 1 year ago
25
NVIDIA/cutlass #736

[QST] cutlass profiler build issue

I tried to follow the guidance at https://github.com/NVIDIA/cutlass/blob/master/media/docs/profiler.md to compile the cutlass profiler, but I get stuck trying to execute: ```shell $ make cutlass_pro…

Harunokaze updated 1 year ago
1
NVIDIA/cutlass #725

[FEA] Add Support for SIMT Ops in GEMM and CONV to Match Cur…

**Is your feature request related to a problem? Please describe.** There are many algorithms for gemms and convs that are designed specifically for TensorOps. For example, any of the algorithms that …

aadulla updated 1 year ago
2
NVIDIA/cutlass #782

[QST] is rank_k supported on sm_75?

**What is your question?** If I understand correctly, rank_k kernels are by default for sm_80 and newer devices. And it seems I cannot run any rank_k operations with cutlass_profiler on a T4 device.…

jxybb updated 1 year ago
5
nim-lang/RFCs #160

Project Picasso - A multithreading runtime for Nim

# Project Picasso - a multithreading runtime for Nim _"Good artists borrow, great artists steal." -- Pablo Picasso_ ## Introduction The Nim destructors and new runtime were introduced to pro…

mratsim updated 1 year ago
30
NVIDIA/cutlass #856

How to implement int8 complex GemmBatched?

Hi, I want to implement int8 complex GemmBatched for my project to run on an sm70 device (uint8 * uint8 = uint32). May I ask what's the best way to do it?

zhangyilalala updated 1 year ago
15
NVIDIA/cutlass #610

[BUG] CUDA Error CUresult.CUDA_ERROR_ILLEGAL_ADDRESS when us…

**Describe the bug** I am trying to do a gemm between two fp32 arrays using the python api to produce a fp32 output. I would like to leverage tensor cores for this operation. I modified the …

rkindi updated 2 years ago
4
NVIDIA/cutlass #675

[QST] Does simt kernel support gather-gemm-scatter fusion?

Hello! I wrote a custom simt kernel to do gather-gemm-scatter fusion. The profiler picks the kernel settings. But I find it gives the wrong result for gather-gemm-scatter. Does the simt kernel sup…

umiswing updated 2 years ago
2
seanbaxter/circle #49

Question: What is the state of SIMD support in circle?

Hi, I'm interested in using circle's metaprogramming tools to extend an existing codebase that makes use of SIMD intrinsics, but it seems to be failing to compile. A small example [here](https://godbo…

samuelpmish updated 1 year ago
16
TUDelft-CNS-ATM/bluesky #452

why speed is not unchanged? (importing BlueSky as a python pa…

I initialized a plane named AC1 with 250 m/s speed. But it is ... ![myplot](https://user-images.githubusercontent.com/84360925/218401691-aeb47996-4125-484a-bcf6-d06ef5049075.png)

dfz-cauc-2020 updated 1 year ago
6
