-
The following test currently fails:
```c++
TEST_F(MatmulSchedulerTest, SelfMappingErrorSmemEpilogue1dBias) {
NVFUSER_TEST_CUDA_ARCH_RANGE_GUARD(7, 5, 9, 0);
Fusion fusion_obj;
Fusion* fusion = …
-
I'm currently having issues attempting to quantize, save, and then load the model using HF Transformers.
Is there any known working method for quantizing Aria (preferably to 4bit)?
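For reference, a common 4-bit path in Transformers is bitsandbytes quantization via `BitsAndBytesConfig`. Whether this works for Aria specifically is not confirmed here; the model id and save path below are assumptions, and the surrounding calls are a configuration sketch rather than a verified recipe:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# bitsandbytes NF4 quantization config; compute in fp16.
# All values here are illustrative, not a tested recipe for Aria.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# "rhymes-ai/Aria" is an assumed model id; adjust to your checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "rhymes-ai/Aria",
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)

# Recent transformers/bitsandbytes versions can serialize 4-bit weights;
# the saved directory can then be reloaded with from_pretrained.
model.save_pretrained("aria-4bit")
```

Note that saving 4-bit bitsandbytes models requires reasonably recent versions of both transformers and bitsandbytes; older versions raise an error on `save_pretrained` for quantized models.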
-
1. In some projects, Gemm+AllReduce needs to be used. I would like to know whether Gemm+AllReduce can be implemented, and what the possible methods and issues are. Thanks.
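The math behind Gemm+AllReduce is straightforward: shard the GEMM's inner (K) dimension across ranks, have each rank compute a partial product, and sum the partials with an all-reduce. A NumPy simulation of this on a single process (illustrative shapes, no real communication library involved):

```python
import numpy as np

# Tensor-parallel GEMM + AllReduce, simulated with NumPy.
# The K dimension is split across "ranks"; each rank computes a partial
# GEMM, and an all-reduce (sum) of the partials gives the full result.
rng = np.random.default_rng(0)
M, K, N, world_size = 4, 8, 6, 2

A = rng.standard_normal((M, K))
B = rng.standard_normal((K, N))

# Split A's columns and B's rows into per-rank shards.
A_shards = np.split(A, world_size, axis=1)
B_shards = np.split(B, world_size, axis=0)

# Each rank's local GEMM produces a partial (M, N) result.
partials = [a @ b for a, b in zip(A_shards, B_shards)]

# AllReduce(sum): every rank ends up with the full product.
reduced = np.sum(partials, axis=0)

assert np.allclose(reduced, A @ B)
```

In a real multi-GPU setting the final sum would be a library all-reduce (e.g. NCCL); the main practical issues are overlapping the reduction with the GEMM and the numerical effect of summing partial products in a different order.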
-
@efrantar
Awesome work -- always enjoy your research on and implementation of efficient model inference.
I was hoping that you could shed some light on the logic of the [packing](https://github…
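Independent of the exact layout that repository uses, a typical quantized-weight packing scheme stores eight 4-bit values in one 32-bit word via shifts and masks. A minimal NumPy sketch of that general idea (the bit ordering here is an assumption, not necessarily the repo's layout):

```python
import numpy as np

bits = 4
vals_per_word = 32 // bits  # 8 four-bit values per int32

def pack(q):
    # q: 1-D array of ints in [0, 15]; length must be a multiple of 8.
    q = q.astype(np.uint32).reshape(-1, vals_per_word)
    packed = np.zeros(q.shape[0], dtype=np.uint32)
    for i in range(vals_per_word):
        # Place value i in bits [4*i, 4*i + 4) of the word.
        packed |= q[:, i] << (bits * i)
    return packed.astype(np.int32)

def unpack(packed):
    # Reinterpret as unsigned so right shifts are well behaved.
    packed = packed.astype(np.uint32)
    out = np.empty((packed.shape[0], vals_per_word), dtype=np.uint32)
    for i in range(vals_per_word):
        out[:, i] = (packed >> (bits * i)) & 0xF
    return out.reshape(-1).astype(np.int64)

q = np.arange(16) % 16  # values 0..15, two packed words
assert np.array_equal(unpack(pack(q)), q)
```

The kernel-side unpacking then mirrors the same shift-and-mask pattern, which is why the packing order matters for memory-coalesced loads.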
-
Hello, I am currently using auto_scheduler to automatically tune a naive gemm operator. However, after the tuning is completed, I checked the corresponding assembly code and found that the registers r…
-
# Summary
I believe there are some missing `gemm_batch` implementations; looking at the oneMKL docs, it seems this should be supported. A `gemm_batch` with two half matrices as input, a float matrix out, an…
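For reference, the mixed-precision semantics in question (half inputs, float output, accumulation in float) can be modeled with NumPy; the shapes below are illustrative:

```python
import numpy as np

# Reference semantics for a mixed-precision gemm_batch:
# float16 A and B, float32 C, with accumulation done in float32.
rng = np.random.default_rng(0)
batch, M, K, N = 3, 4, 5, 6
A = rng.standard_normal((batch, M, K)).astype(np.float16)
B = rng.standard_normal((batch, K, N)).astype(np.float16)

# Up-convert before the multiply so products accumulate in float32,
# which is what the half-in/float-out GEMM variant is expected to do.
C = np.matmul(A.astype(np.float32), B.astype(np.float32))
assert C.dtype == np.float32
```

This also shows why the variant is useful: the inputs stay compact in fp16 while the accumulator avoids fp16 rounding during the K-dimension sum.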
-
**What is your question?**
Trying to understand the behavior of Gemm with a column-broadcasted bias vector epilogue.
When defining a device `GemmUniversalWithBroadcast` with the following config:
…
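As a plain-math reference for what a column-broadcast bias epilogue computes (independent of any particular CUTLASS configuration): a length-M vector is added across all N columns of the M x N result. A NumPy sketch with illustrative shapes:

```python
import numpy as np

# Column-broadcast bias: a length-M vector added to every column of the
# M x N GEMM result, i.e. D[i, j] = (A @ B)[i, j] + bias[i].
rng = np.random.default_rng(0)
M, K, N = 4, 5, 3
A = rng.standard_normal((M, K))
B = rng.standard_normal((K, N))
bias = rng.standard_normal(M)

D = A @ B + bias[:, None]  # bias broadcast across columns
assert D.shape == (M, N)
```

A row-broadcast bias would instead be a length-N vector added to every row; mixing up which axis the epilogue broadcasts over is a common source of confusion with these configs.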
-
**What is your question?**
Hey folks,
I am having a hard time understanding the following problem.
I exported a PyTorch extension using the following code:
```python
dtype = torch.int32
type_A = to…
-
I want to use te's comm-gemm-overlap module to perform multi-node training; however, the README says this module only supports a single node. Does te have any plans for multi-node support? And what effort…
-
**Describe the bug**
When converting a TF/keras model trained with F64, tf2onnx warns about a lack of float64 support for GEMM by the runtime:
```
onnx_model, _ = tf2onnx.convert.from_keras(m…