-
### Description
We've been doing some performance analysis and have noticed that on bare-metal, a PyTorch image conversion from RGB to YUV will take over 1s for a sample image and on bare-metal it ta…
-
### Background and motivation
It seems that https://github.com/dotnet/runtime/issues/87097 lacks intrinsics for `compress` instructions with merge-masking.
Merge-masking for `vpcompress*` and…
-
OS: debian 9
GCC: 6.3
NASM: 2.12.01
CPU: Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz
ISA-L: 2.31
I have confirmed that my CPU supports AVX512 through https://ark.intel.com/content/www/us/en/ark/…
-
Same as for iqtree v1
https://github.com/Cibiv/IQ-TREE/issues/216
-
I know they're getting removed soon from support, but am curious if this project supports the AVX512 instructions for the 72XX (7210 to 7295) Intel KNL / KNM processor family?
How hard would it be …
-
Hi,
I have written a recipe that generates binaries for different cpu instruction sets:
https://github.com/bioconda/bioconda-recipes/pull/50691
Sometimes, the Azure workflow is successful (http…
-
Feature gate: `#![feature(stdarch_x86_avx512)]`
This is a tracking issue for the AVX-512 (and related extensions) intrinsics in `core::arch`.
### Public API
This feature covers all of the int…
-
### Describe the feature
Currently, the CRC32 implementation is not optimized for SSE4.2 or AVX512.
### Use Case
Provide better performance for CRC32 using HW optimized implementation
### Proposed S…
-
Hello.
I see that for Intel AVX2 code you keep the same binary from Haswell (2013) architecture and the same goes for Intel AVX512 using SkylakePurley (2017) executable and for AMD AVX2/AVX512 you ha…
-
I am using the following commands to build par2cmdline-turbo in a Rocky Linux 8 container.
```shell
git clone https://github.com/animetosho/par2cmdline-turbo.git
cd par2cmdline-turbo
aclocal
automa…