-
Using the `-march=native` and/or `-O3` compilation flags can result in a significantly (more than 2x) slower executable.
The code I am seeing problems with is https://github.com/wolfpld/etcpak, specifi…
-
Changes to the AVX512 lyra2 code in sponge-2way.c for v3.11.2 produced improvements of
between 6% for x21s and 47% for lyra2z. However, performance dropped 9% for x22i and
5% for x25x. It's easily repro…
-
Since raxml-ng relies on libpll, maybe this feature request is best lodged there rather than here. A quick look shows a similar feature request open since 2013 but presumably for the KNC add-in card a…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch…
-
**LocalAI version:**
Using Docker image:
`localai/localai:latest-aio-gpu-hipblas`
**Environment, CPU architecture, OS, and Version:**
- Ubuntu 22.04
- Xeon X5570 [Specs](https://ark.intel.c…
-
➜ ckb-miner git:(develop) cc -v
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Foun…
-
At present we are using `-axCORE-AVX2` across all executables in the `gadi-transition` branch, since that produces an executable that runs on the old Broadwell nodes as well as the new Cascade Lake on…
-
### Description
This may be a bug or a feature request; I'm not sure.
I'm building a project for Windows that needs to support CPU and CUDA backends. In the .csproj I have:
```
``…
-
### Code
```Rust
// compile with `cargo run --target aarch64-unknown-linux-gnu`
fn main() {
unsafe { dbg!(__crc32b(13, 42)) };
}
unsafe fn __crc32b(mut crc: u32, data: u8) -> u32 {
…
-
My system is Ubuntu 22 running on an x86_64 CPU.
I compiled with the commands in the README, as follows:
```
cmake -B build
cmake --build build -j --config Release
```
Then I run the main program as fo…