gemm Search Results - Githubissues

1000+ results for gemm

triton-inference-server/fastertransformer_backend #111

CUDA runtime error: CUDA driver version is insufficient for …

Hi, I'm following the [setup guide](https://github.com/triton-inference-server/fastertransformer_backend#setup). I found a bug and solved it. https://github.com/triton-inference-server/fastertra…

lkm2835 updated 1 year ago
1
google/XNNPACK #5312

will you Plan to support int8 perchannel quantize for linear…

Rickustc updated 7 months ago
2
NVIDIA/cuda-samples #74

GPU power consumption doesn't go back to idle state after CU…

I am running some experiments using NVML and CUDA GeMM implementation for power consumption. I measured the following trend of power consumption for multiplication of two 16384 sized square matrices. …

afzalxo updated 3 years ago
1
NVIDIA/apex #530

Tensor core usage profiling - Turing architecture

Hi there I am checking `TC - tensor core usage` counter for a standard resnet50 model and although I see tensor core kernels being invoked, their corresponding `TC` counter still shows `-`. Am I do…

SrivastavaKshitij updated 3 years ago
11
NVIDIA/cutlass #1556

[QST/BUG] why cute kernel transfers so much data between L2 …

**What is your question?** I am learning to use cute to build a hgemm kernel. Tested on A10 GPU, the cute kernel is good with small problem size such as m/n/k = 4096, but I found it's much slower …

irasin updated 1 month ago
9
triton-inference-server/tensorrtllm_backend #573

Inference server stalling

### System Info - tensorrtllm_backend built using Dockerfile.trt_llm_backend - main branch tensorrt llm (0.13.0.dev20240813000) - 8xH100 SXM - Driver Version: 535.129.03 - CUDA Version: 12.5 …

siddhatiwari updated 3 weeks ago
4
microsoft/onnxruntime #20788

[Build] MoE related unit tests fail for older architectures …

### Describe the issue MoE unit tests fail on older architecture. The tests have a particular requirement. If that requirement is not met it is pointless to run the tests. ### Urgency _No response…

yuslepukhin updated 4 months ago
4
ZFTurbo/KAGGLE_DISTRACTED_DRIVER #2

AssertionError: AbstractConv2d Theano optimization failed

This is the error encountered when running model.fit AssertionError: AbstractConv2d Theano optimization failed: there is no implementation available supporting the requested options. Did you exclude b…

Stacktohack updated 7 years ago
3
bluss/matrixmultiply #25

Allow operations on transposed matrices, i.e. Op(A) and Op(B…

I wanted to use the integer gemm code from 0430cf0, and realized that there currently is no way of performing an operation on transposed matrices while I wanted to perform `A^t A`. In the BLAS context…

SuperFluffy updated 5 years ago
8
IntelPython/dpctl #1843

Update the documentation for building Pybind11 SYCL Backend …

Hi, I'm trying to build the pybind11 extension mentioned under onemkl_gemv example DPCTL build with CUDA: https://github.com/IntelPython/dpctl/tree/master/examples/pybind11/onemkl_gemv Example men…

sreerajkksd updated 1 month ago
2
