avx512 Search Results - Githubissues

1000+ results for avx512

mratsim/laser #21

[GEMM] Enhance serial implementation

With #20, the parallel schedule seems to scale perfectly on many cores: ``` $ OMP_NUM_THREADS=1 OPENBLAS_NUM_THREADS=1 ./build/gemm_f32_serial Warmup: 0.9036 s, result 224 (displayed to avoid comp…

mratsim updated 5 years ago
1
PaddlePaddle/Paddle #65068

Illegal instruction (core dumped) after installing the develop version of Paddle

### Bug description (Describe the Bug) Paddle develop cannot run in the following environment ``` python -m pip install paddlepaddle-gpu==0.0.0.post120 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html ``` ``` Python 3…

GreatV updated 4 months ago
10
instructlab/instructlab #2333

AssertionError Exception: from await original_route_handler(…

**Describe the bug** I get this error when I run the command `ilab data generate --taxonomy-path ./taxonomy`. **To Reproduce** Set up ilab with the latest release version on a RHE…

pvrbharg updated 1 week ago
2
madgraph5/madgraph4gpu #177

Multi-SIMD-mode executables?

This is a spinoff of vectorisation issue #71 and a followup to the big PR #171. --- (The first part of this description also serves as documentation of what is available there now!). The curr…

valassi updated 2 years ago
2
sdsc/spack #50

SDSC: PKG - expanse/0.17.3/gpu/b - Missing Amber GPU(example…

nwolter updated 1 year ago
14
ggerganov/llama.cpp #9838

Bug: Llama.cpp with cuda support outputs garbage response wh…

### What happened? ``` You are a helpful assistant > what is 2+2+2+2 44444444444444444444444444444444444444444444444444444444444444444444444444444444444444444 > ``` When I run llama-cli with…

bmahabirbu updated 1 week ago
7
pytorch/pytorch #106614

Case study of torch.compile / cpp inductor on CPU: min_sum /…

### 🐛 Describe the bug (I'll add actual benchmarking details and logs and output_code.py in a bit) I'm doing min_sum and mul_sum in two setups: 1. (D, ) x (D, ) -> scalar 2. (B, N, 1, D) x (B,…

vadimkantorov updated 7 months ago
17
pytorch/pytorch #122478

6470 segmentation fault (core dumped)

### 🐛 Describe the bug I have installed pytorch on my python venv as well as my conda environment using the download format as given on the official pytorch documentation with cuda11.8 as: `pip3 i…

Rishabkashyap14 updated 3 months ago
6
llvm/llvm-project #94419

Math function vectorization failure with AVX-512

I am writing a machine learning software that needs to compute “Y = exp(a⋅X)”. Sample code: ```c++ #include #include void func(float a[]) { for(std::size_t i = 0; i != 16; i++) { …

m13253 updated 1 week ago
3
halide/Halide #5138

The simplifier drops unused LetStmts even if the RHS has sid…

I passed a Target object with NoAssert turned on to compile_to_lowered_stmt on a pipeline of mine. Here's what I get: module name=117_0_t, target=x86-64-linux-avx-avx2-avx512-avx512_skylake-f16c-fm…

benoitsteiner updated 1 year ago
6
