gemm Search Results - Githubissues

1000+ results
for gemm

Best match


tensorflow/tensorflow #75815

ruy::CpuInfo::Initialize() Null pointer dereference: SIGSEGV…

### Issue type Bug ### Have you reproduced the bug with TensorFlow Nightly? No ### Source source ### TensorFlow version tensorflow-lite:2.16.1 ### Custom code No ### OS platform and distribu…

ninh-huynh updated 4 weeks ago
4
triton-lang/triton #2513

Understanding Triton GEMM FP8 performance

Hello, we have measured the FP8 GEMM performance using Triton on NVIDIA H100 (500 W, 1980 MHz). We would like to request your help in understanding if the performance is expected. Since H100 FP8 o…

sryap updated 4 months ago
14
iree-org/iree #13641

Implicit GEMM support

New Epic item to track Implicit GEMM work. The tasks here are generally listed so that later tasks in a block depend on previous ones. ```[tasklist] ## Common Tasks - [ ] #13541 - [ ] #13627 - …

allieculp updated 1 year ago
4
NVIDIA/nccl #1433

How to handle comp-comm overlapping?

I'm facing a problem with an NCCL kernel overlapping with a CUTLASS GEMM kernel. I used a CUTLASS GEMM kernel with a grid size of and my GPU has 142 SMs, so apparently there is a surplus of SMs. Then I…

chenhongyu2048 updated 1 month ago
6
NVIDIA/cutlass #1800

[QST] kInternalError while increasing warp count in older S…

**What is your question?** Internal CUTLASS error is observed, when I try increasing the warp count for kernel "cutlass_simt_hgemm_256x128_8x2_nt_align1" to values other than default 4x2x1 (by changi…

Shreya-gaur updated 1 week ago
1
apache/tvm #17375

[Bug] Improper touched buffer assignment of Pass `MergeShare…

Lead to Suboptimal Shared Memory Reuse. PR #9341 introduced liveness analysis to merge the shared memory allocations, places touched buffer records at the outermost scope (e.g., outer loops) rathe…

LeiWang1999 updated 1 week ago
2
tracel-ai/burn #2049

Feature Request: Full Gemm operation / node

### Feature description Introduction of ONNX Gemm operation conversion https://onnx.ai/onnx/operators/onnx__Gemm.html ### Feature motivation Useful for optimisations Currently get: ``…

mtobin-tdab updated 2 months ago
1
PaddlePaddle/PaddleNLP #9206

[Bug]: Error when compiling paddlenlp_ops

### Software environment ```Markdown paddle2onnx 1.2.3 paddlefsl 1.1.0 paddlenlp 3.0.0b1 paddleocr 2.8.1 paddlepaddle 2.6.2 paddlepaddl…

Jakin-huang updated 3 days ago
3
ARM-software/ComputeLibrary #1084

sparse gemm kernels are not supported in ACL

**Output of 'strings libarm_compute.so | grep arm_compute_version':** arm_compute_version=v23.11 Build options: {'Werror': '0', 'debug': '0', 'neon': '1', 'opencl': '0', 'embed_kernels': '0', 'os…

snadampal updated 3 weeks ago
7
OpenMathLib/OpenBLAS #4890

1xK @ KxN matrix multiplication using GEMM significant slowe…

CPU Info $ lscpu Architecture: aarch64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): …

Avafly updated 1 month ago
2
