-
While it did not show up in my tests, it seems that alloca might cause a slowdown, although it reduces VRAM usage.
One solution is to make alloca an option and generate kernels with and without al…
-
# Feature request / Bug report
Not sure which this falls under, as it is possible I am just doing something wrong in the pipeline.
## Motivation
In some GPR problems you may have a s…
-
Hello, the driver doesn't compile on kernel versions 6.2 and higher... I admittedly have little experience with C, so the best I was able to do to get it working in my fork was to comment out …
-
Can we make the `block_size` in the kernels more adaptive or parameterized? For example, 1024 is pretty big for my GPU with 12 GB of memory.
I have to run with `block_size = 32`.
```c
void fused_classifier3(…
```
azret updated
5 months ago
-
### Problem
We have extracted [jupyter-server-terminals](https://github.com/jupyter-server/jupyter_server_terminals) into its own package; should we do the same for kernels?
### Proposed Solution
Create a `ju…
-
There has been some effort to implement these. This issue is to remind us to integrate it as appropriate.
-
Currently, they are installed in `rootfs`. Alternatively, the `boot` directory could be placed next to it, but that is still a bit odd. Find a good approach and implement it.
Note that multiple kernels…
-
I generally run realtime kernels wherever I can (using Ubuntu Studio in my case). My principal application (Ardour) requires one in order to have good flow for audio mixing. With R/T I can usuall…
dtaht updated
4 months ago
-
The [`lbmpy`](https://pycodegen.pages.i10git.cs.fau.de/lbmpy/index.html) package (repository available [here](https://i10git.cs.fau.de/pycodegen/lbmpy)) provides the possibility to generate parallel co…
-
https://github.com/linkedin/Liger-Kernel/tree/main/examples/medusa
With the implementation of FusedLinearCrossEntropy and other kernels in Liger-Kernel, we are able to effectively reduce the memory…