avx512 Search Results - Githubissues

1000+ results for avx512


pytorch/pytorch #141212

Issue with random normal with DTensor

### 🐛 Describe the bug Tensor dim 0 is already sharded on mesh dim 1, DTensor operator implementation does not support things like hybrid sharding strategies yet (i.e. [Shard(0), Shard(0)]) this i…

mayank31398 updated 10 hours ago
2
microsoft/onnxruntime #22905

[Performance] Binary operators using SSE on AVX systems

### Describe the issue Hi! I've been building ORT using the command and noticed binary operators like _Add_ are being executed by the Eigen library. I did some debugging and noticed Eigen is using t…

eralmual updated 1 hour ago
1
dotnet/runtime #88946

[API Proposal]: Expose remaining AVX512-VBMI2 hardware instr…

### Background and motivation There are approved and soon to be added [AVX512-VBMI2 Compress & Expand intrinsics](https://github.com/dotnet/runtime/issues/87097) as part of new vector mask proposal…

MadProbe updated 2 weeks ago
5
animetosho/par2cmdline-turbo #35

Error in makefile when compiling on rocky linux 8

I am using the following commands to build par2cmdline-turbo in a Rocky Linux 8 container. ```shell git clone https://github.com/animetosho/par2cmdline-turbo.git cd par2cmdline-turbo aclocal automa…

Exist2Resist updated 1 month ago
4
tensorflow/tensorflow #80331

Aborted (core dumped) in `tf.raw_ops.MatrixSolve`

### Issue type Bug ### Have you reproduced the bug with TensorFlow Nightly? Yes ### Source source ### TensorFlow version tf 2.17 ### Custom code Yes ### OS platform and d…

x0w3n updated 8 hours ago
2
Ji-Peng/eng25519_artifact #1

Question about EC operations performance

It seems that Intel also implements curve25519 based on AVX512-IFMA. Have you compared the performance of the two implementations? https://github.com/intel/cryptography-primitives/tree/5ada2314016b…

changtong9 updated 1 week ago
1
tensorflow/tensorflow #80316

Aborted (core dumped) in `tf.raw_ops.MatrixInverse`

### Issue type Bug ### Have you reproduced the bug with TensorFlow Nightly? Yes ### Source source ### TensorFlow version tf 2.17 ### Custom code Yes ### OS platform and distribution Linux U…

x0w3n updated 8 hours ago
2
Mysticial/Flops #26

Optimized binaries for all new architectures ?

Hello. I see that for Intel AVX2 code you keep the same binary from Haswell (2013) architecture and the same goes for Intel AVX512 using SkylakePurley (2017) executable and for AMD AVX2/AVX512 you ha…

NikosDi updated 2 months ago
4
tensorflow/tensorflow #80312

Aborted (core dumped) in `tf.raw_ops.Cholesky`

### Issue type Bug ### Have you reproduced the bug with TensorFlow Nightly? Yes ### Source source ### TensorFlow version tf 2.17 ### Custom code Yes ### OS platform and d…

x0w3n updated 1 day ago
1
llvm/llvm-project #97694

[AVX512] Replace x86_avx512_mask_pmov_* register intrinsics …

We already expand basic truncation intrinsics to the trunc+shuffle sequence: ```cpp static __inline__ __m128i __DEFAULT_FN_ATTRS128 _mm_cvtepi64_epi8 (__m128i __A) { return (__m128i)__builtin_s…

RKSimon updated 4 months ago
1
