-
## Description
This is another BLAS library that is faster than OpenBLAS and, unlike it, supports AVX512 as well. Although it is not as fast as MKL on x86 and x86_64 processors, it outperforms othe…
-
1- I used PuTTY to connect to Beluga
2- I updated remoll
3- I ran the command "build/remoll macros/pion/pionDetectorLucite.mac"
4- There was a crash, as follows:
The lines below might hint at the…
-
I just ran a profiler on the enes student model (http://statmt.org/bergamot/models/). This was on the `intgemm_reintegrated_computestats` branch, but the Select function is the same in master.
```
Each …
-
### Your current environment
```text
The output of `python collect_env.py`
Collecting environment information...
WARNING 10-04 10:39:09 rocm.py:13] `fork` method is not supported by ROCm. VLLM_WOR…
-
Hi,
First of all - thanks for creating (and open-sourcing) this swift code! Looks great!
I was looking through the SIMD wrappers for `AVX512F` in `vector.h` and I noticed a few wrappers that re…
-
from @xaionaro:
> theoretically you may use fittool to disable new microcodes, which disables AVX512 instructions on cheap Intel processors (to reenable them back).
see https://github.com/linuxboo…
-
OpenBLAS already added flang support, but I don't think this is being tested on Windows? While reviving the old [effort](https://github.com/conda-forge/openblas-feedstock/pull/115) to build conda-forg…
-
### Describe the issue:
In a simple GUI visualization code, if I import open3d before numpy, the program runs normally. However, if I import numpy first and then open3d, there will be an OpenGL loa…
-
Dear friends,
I have a function implemented in C++ in four versions: avx512, avx2, sse4_1 and sse2.
I am trying to identify whether "avx512bw", "avx2", "sse4_1" and "sse2" are …
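
A minimal sketch of one way to check for these features at runtime, assuming the question is about which instruction sets the current CPU supports and that the machine runs Linux, where `/proc/cpuinfo` lists them under those same names. It is written in Python for brevity rather than the C++ of the original function, and the helper names `parse_cpu_flags`, `best_supported` and `read_cpuinfo` are hypothetical.

```python
from pathlib import Path

# Feature names as they appear on the "flags" line of /proc/cpuinfo (Linux).
# Ordered from most to least preferred implementation.
WANTED = ("avx512bw", "avx2", "sse4_1", "sse2")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def best_supported(flags, preference=WANTED):
    """Return the first feature in `preference` present in `flags`, or None."""
    for name in preference:
        if name in flags:
            return name
    return None


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read cpuinfo text; return an empty string if the file is unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    flags = parse_cpu_flags(read_cpuinfo())
    for name in WANTED:
        print(f"{name}: {'yes' if name in flags else 'no'}")
    print("best:", best_supported(flags))
```

In C++ itself, GCC and Clang provide `__builtin_cpu_supports` for the same purpose on x86 targets; the parsing approach above is only an illustration of the dispatch idea (pick the most capable version the CPU reports).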
-
### 🐛 Describe the bug
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1695392020201/work/c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTor…