-
**Describe the bug**
Setting the kernel_properties to include grf_size for an ESIMD kernel does not change the register file size.
**To Reproduce**
```c++
#include
#include
#include
int ma…
```
-
part of #7338
The beta, gamma, invgauss and recipinvgauss kernels can be obtained through SciPy's distributions, with appropriate parameterization.
Birnbaum-Saunders (fatiguelife) should also be pos…
-
Many libcudf users have expressed interest in using a 64-bit size type (see #3958 for reference). `cudf::size_type` is an `int32_t`, which limits the number of elements in libcudf columns…
-
Currently, our GPTQ kernels only support float16 precision.
-
Currently, the CUDA target prepares for launch and is launched entirely from Python code. This should be replaced by CPU JIT code so that launches are free of Python, thus reducing the launch ove…
-
The labels of the Python kernels in the launcher are hard to read, especially when there are a large number of kernels. I had to hover over them to read the full name.
Below is an example of runnin…
-
# Bug Report
---
## Describe the bug
When tags are added to `digitalocean_kubernetes_cluster` resources, they are reconciled every time. I suspect this is because when the clusters are crea…
-
Based on the latest information on CUDA Graphs, follow these rules of thumb:
- always use CUDA Graphs to start kernels; it will always be at least as fast as not using task graphs,…
-
Hi,
When running `make alltuners` on a Mali GPU, some tuners run for hours, and eventually the process hangs and never returns. Is there any way to speed this up?
-
### Describe the bug, including details regarding any error messages, version, and platform.
I have a chunked array made of view/slices of the same array.
When I call if_else on that array, the …