avx512 Search Results - Githubissues

1000+ results for avx512

pytorch/executorch #4475

error when "call_function:aten.copy.default" can not be lowe…

### 🐛 Describe the bug When I use executorch to lower my transformer-based model to the xnnpack backend, I get the error ``` INFO:executorch.backends.xnnpack.partition.xnnpack_partitioner:…

TaylorYangX updated 3 months ago
5
JuliaSIMD/CPUSummary.jl #6

CPUSummary.jl v0.1.14 breaks CI of Trixi.jl on skylake-avx51…

We observed some specific problems when going from CPUSummary.jl v0.1.8 to v0.1.14 at [Trixi.jl](https://github.com/trixi-framework/Trixi.jl). Everything is fine with the old version of CPUSummary.jl.…

ranocha updated 2 years ago
8
JayDDee/cpuminer-opt #225

Lyra2 performance paradox

Changes to avx512 lyra2 code in sponge-2way.c for v3.11.2 produced improvements of between 6% for x21s and 47% for lyra2z. However, performance dropped 9% for x22i and 5% for x25x. It's easily repro…

JayDDee updated 7 months ago
13
intel-analytics/ipex-llm #12374

Several GPU models behave erratically compared to CPU execut…

Here is a trace from my Intel Arc A770 via Docker: ``` $ ollama run deepseek-coder-v2 >>> write fizzbuzz """"""""""""""""""""""""""""""" ``` And here is a trace from Arch Linux running on …

pepijndevos updated 1 day ago
8
spack/spack #33712

Installation issue: openmpi@4.1.4

### Steps to reproduce the issue ```console $ spack install openmpi@4.1.4 %gcc@7.3.0 +legacylaunchers +gpfs +pmi schedulers=slurm >> log.openmpi 2>&1 ``` ### Error message Error message ==> In…

nish-ant updated 2 years ago
1
pytorch/pytorch #139207

Major perf regression with `BatchNorm2d` + `torch.compile` w…

### 🐛 Describe the bug Since PyTorch 2.5.0, there is a massive (more than 10x) performance regression when using `BatchNorm2d` with `torch.compile` set to `reduce-overhead` and `DistributedDataPara…

atafra updated 3 days ago
3
pytorch/pytorch #132020

`torch.Tensor.to` ignores `memory_format` kwarg

### 🐛 Describe the bug Minimal reproducer: ```python import torch x = torch.ones(1).expand(2) print(f"{x.is_contiguous()=}") print(f"{x.to(memory_format=torch.contiguous_format).is_contiguous(…

timmoon10 updated 4 months ago
1
llvm/llvm-project #97271

[X86] VNNI intrinsics argument types don't match the actual …

For example: `__m128i _mm_dpbusd_avx_epi32 (__m128i src, __m128i a, __m128i b)` This takes 1 x "src" and 2 x "a * b" multiplication inputs but the clang/llvm intrinsics are defined as: ``` TA…

RKSimon updated 5 months ago
1
SeisSol/SeisSol #1213

HDF5 error: unable to synchronously open attribute (netcdf-c…

**Describe the bug** I'm trying to run a setup of Vincent. It reports it hangs at t=0. But when I try to run it, I experience an HDF5 error: ``` Wed Nov 20 09:20:45, Info: 9 (dr): 0 Wed Nov 20 …

Thomas-Ulrich updated 1 week ago
6
vllm-project/vllm #6777

[Performance]: Medusa SD have poor performance than baseli…

### Proposal to improve performance Test new feature medusa speculative sampling with [vllm v0.5.2](vllm-openai:v0.5.2). After using Medusa speculative sampling, the performance dropped significantl…

deepindeed2022 updated 1 month ago
6
