-
Hi
I'm running the avx512bw test on my SKL, which supports avx512bw,
but I get illegal instruction traps. After some investigation, it seems
vpermb/vpermi2b belong to avx512vbmi instead, t…
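
A rough sketch of how the distinction could be checked before running such a test, assuming a Linux host where `/proc/cpuinfo` lists feature flags under the names `avx512bw` and `avx512vbmi`. The helper names `cpu_flags` and `can_run_vpermb` and the sample flags line are hypothetical, not part of the test suite in question:

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def can_run_vpermb(flags: set[str]) -> bool:
    # vpermb/vpermi2b are AVX512_VBMI instructions; avx512bw alone is not enough.
    return "avx512vbmi" in flags


# Hypothetical, shortened flags line for a CPU with AVX512BW but without VBMI.
sample = "processor\t: 0\nflags\t\t: fpu sse2 avx2 avx512f avx512bw avx512vl\n"
flags = cpu_flags(sample)
print("avx512bw" in flags, can_run_vpermb(flags))
```

On a real machine the input would come from something like `open("/proc/cpuinfo").read()`; a test that uses vpermb could then be skipped rather than trapping.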
-
### 🐛 Describe the bug
```python
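import torch

# Batch of 2 point sets, each holding 30 two-dimensional points.
d = torch.rand((2, 30, 2))
# Pairwise distances of each point set with itself.
dis_matrix = torch.cdist(d, d)
diag = torch.diag(dis_matrix[0])
print(diag)
print(all(diag == 0))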
```
The diagonal of the distan…
-
## Issue description
test_ops.py failing:
```
======================================================================
ERROR: test_python_ref_meta__refs_linalg_svd_cpu_complex128 (__main__.TestCom…
-
This issue's URL is referenced in the message users see when they try to install Kite on a machine that doesn't support the AVX instruction set.
This can be a place for discussion and so folks can …
-
This issue is a placeholder for future discussion about supporting 4-dimensional-reducing dot-product instructions taking 8bit inputs and accumulating into 32bit, i.e.
```
int32_accumulator += int…
-
### 🐛 Describe the bug
Forwarding a tensor `img` through a simple PyTorch Conv2d model produces a different result than forwarding `img + torch.zeros_like(img)`.
Here is a minimal example: https…
-
### 🐛 Describe the bug
The only way to call an FSDP model (e.g. `fsdp_model(inputs)`) seems to be if `torch.cuda.current_device()` returns the rank/id of the current process/device, regardless of w…
-
### Describe the issue:
In some cases, when we use `np.full` with an integer fill_value and `dtype=np.float32`, the result is off by 1.
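
For context, one well-known way an integer can come out off by 1 in float32 is plain rounding: float32 has a 24-bit significand, so not every integer above 2**24 is representable. Whether that is what this report describes is an assumption, since the reproduction is cut off here. A minimal stdlib-only sketch (the helper `to_float32` is hypothetical and does not use NumPy):

```python
import struct


def to_float32(value: int) -> float:
    """Round-trip an integer through IEEE 754 binary32 using only the stdlib."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


exact = 2**24        # 16777216 fits in a 24-bit significand
inexact = 2**24 + 1  # 16777217 needs 25 significant bits

print(to_float32(exact) == exact)       # True
print(to_float32(inexact) - inexact)    # -1.0 (rounded to the nearest even neighbour)
```

If the values in the report are below 2**24, this rounding cannot be the explanation and the discrepancy would have to come from somewhere else.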
### Reproduce the code example:
```python…
-
### Description of the bug
PrusaSlicer crashes when opening the STL file from an archive.
First, the slicer hangs without consuming much RAM, but it consumes around 1 core of the CPU before crashi…
-
### 🐛 Describe the bug
I have a two-layer network. The input is a 2D array of token ids, the first layer is an embedding layer that replaces each pixel with the respective embedding, the second layer does a c…