gemm Search Results - Githubissues

1000+ results for gemm

pytorch/torchtitan #434

Question about custom cuda operators for tensor parallelism

We are currently trying to apply torchtitan to MoE models. MoE models require using grouped_gemm https://github.com/fanshiqing/grouped_gemm. GroupedGemm ops basically follow the same rule as in Column…

vermouth1992 updated 3 weeks ago
5
QwenLM/Qwen2 #755

RuntimeError: at::cuda::blas::gemm: not implemented for stru…

Has anyone run into this problem? The error is raised at the model.generate() call.

pengmengyin updated 2 weeks ago
1
intel/intel-xpu-backend-for-triton #348

08-grouped-gemm.py poor performance

Current output of test 11: ``` group-gemm-performance: N cuBLAS Triton 0 128.0 0.11488 276574.06250 1 256.0 0.12080 276332.68750 2 512.0 0.14360 276066.46875 3 1…

prathams417 updated 5 days ago
3
NVIDIA/cutlass #1598

[QST] does cutlass depthwise conv support 1x1 filter

hi, i have a question about depthwise conv with params like 1x1 filter, stride=1, pad=0, dilation=1, i have a compile error raise by checking kWarpGemmIterations in cutlass/conv/threadblock/depthwise_…

tengdecheng updated 1 day ago
1
IST-DASLab/marlin #18

[QST] Weight Format & GEMM

@efrantar Awesome work -- always enjoy your research on and implementation of efficient model inference. I was hoping that you could shed some light on the logic of the [packing](https://github…

jeromeku updated 3 months ago
2
ROCm/composable_kernel #775

Compilation error for navi10 (use of undeclared identifier '…

Hello, I have some trouble to compile composable_kernel for my AMD GPU architecture (gfx1010) ``` cmake …

TyraVex updated 9 hours ago
26
cornell-zhang/allo #165

[Feature] Examples of how the llm pldi24-artifacts are gener…

**Is your feature request related to a problem? Please describe.** I am trying to retarget the llm artifacts to my own FPGA board. I'd like to regenerate the HLS code to try more aggressive quantizat…

bibo-msft updated 2 weeks ago
1
sarah-ek/gemm #31

gemm_f16: Build fails in debug mode for AArch64

Hi, I created a small Rust example: ``` use gemm_f16::f16; fn main() { println!("Hello, fp16!"); let a = f16::from_f32(3.1f32); let b = f16::from_f32(2.2f32); let…

brunocaballero updated 2 weeks ago
2
NVIDIA/cutlass #1595

[QST] Available Fusion Options in EVT

**What is your question?** In the examples provided, EVT demonstrates the capability to fuse different epilogue functions, optimizing their execution. I'm interested in knowing whether EVT can also i…

satyabhagavan updated 1 week ago
2
AnonymousYWL/TPDS #1

Instructions to run and execute the libshalom2 library

Hello @AnonymousYWL , Can you please provide instructions on how to use the libshalom2 library and also its gemm kernel API's? How to run a basic example code on your novel gemm kernel?

vineel96 updated 1 month ago
2
