-
### 🐛 Describe the bug
Batched GEMM performs poorly for a large batch size (`12*7*120*64*129`) with small matrix sizes (`3x3`, `3x1`):
```python
import torch
import time
points = t…
-
### Go version
go version go1.23.2 darwin/arm64
### Output of `go env` in your module/workspace:
```shell
GO111MODULE='on'
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/aimuz/Library/Caches/go-…
-
### 🐛 Describe the bug
Under specific inputs, `reflection_pad1d_backward` triggered a crash.
```python
import torch
grad_output = torch.full((2,8,2,2,2,10,9,9,), 9.87654e+09, dtype=torch.float)
…
-
### 🐛 Describe the bug
# Problem
When running compiled FlexAttention in a multi-GPU environment, if the device being used is not the first GPU (i.e., not `cuda` or `cuda:0`, but `cuda:1`, etc.), a…
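The device-index distinction behind this report can be shown without any GPUs. `resolve_index` below is a hypothetical helper (not part of the FlexAttention API): a bare `cuda` device carries no index and falls back to the current device, which defaults to 0, so `cuda` and `cuda:0` normally coincide while `cuda:1` is a genuinely different device.

```python
import torch

def resolve_index(device: str) -> int:
    """Hypothetical helper: which CUDA index a device string refers to."""
    d = torch.device(device)
    # A bare "cuda" has index None and falls back to the current device,
    # which defaults to 0.
    return 0 if d.index is None else d.index

print(resolve_index("cuda"), resolve_index("cuda:0"), resolve_index("cuda:1"))  # 0 0 1
```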
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
The script to reproduce the bug:
```python
import os
import time
import pickle
import torch
import threading
import torch.distributed as dist
import torch.distributed.distributed_c10d as c10…
-
### Describe the issue
I am testing AMX's performance in BF16 inference. It turns out that under different settings of `DNNL_MAX_CPU_ISA` (`AVX512_CORE_AMX` `AVX512_CORE_BF16` `AVX512_CORE_VNNI`), …
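A minimal sketch of how the ISA cap is applied (the original benchmark is truncated above): oneDNN reads `DNNL_MAX_CPU_ISA` once at library load, so it must be set before `torch` is imported. The BF16 matmul is a stand-in workload; the sizes are assumptions.

```python
import os

# Must be set before importing torch: oneDNN reads it once at load time.
os.environ.setdefault("DNNL_MAX_CPU_ISA", "AVX512_CORE_AMX")  # or AVX512_CORE_BF16 / AVX512_CORE_VNNI

import time
import torch

a = torch.randn(1024, 1024).to(torch.bfloat16)
b = torch.randn(1024, 1024).to(torch.bfloat16)

t0 = time.perf_counter()
c = a @ b  # BF16 GEMM; dispatches to AMX only when the cap allows it
print(f"{os.environ['DNNL_MAX_CPU_ISA']}: {time.perf_counter() - t0:.4f}s")
```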
-
### OpenVINO Version
2023.1.0
### Operating System
Ubuntu 22.04 (LTS)
### Hardware Architecture
x86 (64 bits)
### Target Platform
Architecture: x86_64
CPU op-mode(s): 32-bi…
-
### 🐛 Describe the bug
This script loads a list of tensors and diffs `_foreach_norm` and `[torch.norm(t) for t in ...]`:
```python
import torch
ts = torch.load('list_of_tensors.pt', weights_only=Tru…
-
### Your current environment
```
Collecting environment information...
PyTorch version: 2.1.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ub…