-
It would be interesting to look at SIMD for optimizing the BitIter. I'm unsure whether it would be worth it with some extensions like AVX512, since they generally lower the CPU frequency, but SSE2 is probabl…
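For reference, here is a minimal scalar sketch of what a bit iterator over a single 64-bit word could look like. The actual `BitIter` is not shown in this issue, so the struct layout, the `word` field, and the `set_bits` helper are all assumptions made for illustration. It is only meant as a baseline that any SIMD variant would have to beat in a benchmark.

```rust
/// Hypothetical stand-in for the `BitIter` under discussion (an assumption;
/// the real type may differ). Yields the indices of the set bits in one
/// 64-bit word, lowest index first.
struct BitIter {
    word: u64,
}

impl Iterator for BitIter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.word == 0 {
            return None;
        }
        // Index of the lowest set bit.
        let idx = self.word.trailing_zeros();
        // Clear the lowest set bit so the next call sees the following one.
        self.word &= self.word - 1;
        Some(idx)
    }
}

/// Illustrative helper: collect all set-bit indices of `word`.
fn set_bits(word: u64) -> Vec<u32> {
    BitIter { word }.collect()
}
```

Each step costs one `trailing_zeros` and one and-with-decrement, so the loop runs once per set bit rather than once per bit position. That is worth keeping in mind when judging whether a SIMD rewrite would pay off.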
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…
wxsms updated
4 weeks ago
-
I am writing Rust bindings for hexl [here](https://github.com/Janmajayamall/hexl-rs). I have added support for NTT operations and some elwise operations. However, I am running into issues with elwise …
-
**Describe the bug**
When libclang is added to the SDK and used in a build, the dynamic linker uses libc.so.6 from the system instead of the one from the SDK. There can be a version
This is found wh…
-
Hi, I need this patch to build ELPA as part of the CP2K 2023.1 toolchain in our setup (a rather bare CentOS base, separate compiler and MPI prefix, separate prefix for a lot of other dependencies from…
-
Hi there, I'm trying to build the environment for this repo, and I keep running into an error when installing deform-conv, as the following error message shows. How can I solve this? Thanks!
"""
Collecting…
-
On Skylake/KNL, `TensorFlow` compilation does not work. Following @akesandgren's suggestion, it does compile when the AVX512 instruction set is not used. With Intel `-xCORE-AVX2` or `-xAVX2`, with GCC `-march=native -mno-avx51…
-
As @Maratyszcza and @lemaitre point out in #7, we should consider scatter and gather operations. This is an issue to track that.
Potential topics to discuss:
- Emulation (it is only supported by…
penzn updated
3 years ago
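To make the emulation topic a bit more concrete, here is a minimal sketch of a lane-by-lane scalar fallback for gather and scatter. The 4-lane width, the `f32` element type, and the function names `gather4` and `scatter4` are assumptions chosen for illustration. They are not taken from the proposal or from any existing API.

```rust
/// Emulated gather (hypothetical helper): out[lane] = base[indices[lane]].
/// Returns `None` if any index is out of bounds.
fn gather4(base: &[f32], indices: [usize; 4]) -> Option<[f32; 4]> {
    let mut out = [0.0f32; 4];
    for (lane, &i) in out.iter_mut().zip(indices.iter()) {
        *lane = *base.get(i)?;
    }
    Some(out)
}

/// Emulated scatter (hypothetical helper): base[indices[lane]] = values[lane].
/// Returns `false` and leaves `base` untouched if any index is out of bounds.
/// With duplicate indices, the highest lane wins because lanes are written
/// in ascending order.
fn scatter4(base: &mut [f32], indices: [usize; 4], values: [f32; 4]) -> bool {
    if indices.iter().any(|&i| i >= base.len()) {
        return false;
    }
    for (&i, &v) in indices.iter().zip(values.iter()) {
        base[i] = v;
    }
    true
}
```

Two design questions show up even in this small sketch: what should happen on out-of-bounds indices, and which lane wins when scatter indices collide. Both would presumably need to be pinned down in any specification.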
-
The CMake file uses `-march=native`, which generates binaries that may not run on other machines that lack the build host's instruction sets.
I think the better approach would be to use `-march=haswell`,…
-
### Code
```Rust
// compile with `cargo run --target aarch64-unknown-linux-gnu`
fn main() {
    unsafe { dbg!(__crc32b(13, 42)) };
}
unsafe fn __crc32b(mut crc: u32, data: u8) -> u32 {
…