-
**Motivation**
It's increasingly hard to reach SOL on newer GPU architectures, starting with A100 and H100, especially for simple kernels like:
`thrust::transform(..., thrust::plus{})`, which basi…
-
**Describe the motivation for the feature request**
Some devices (mainly Intel GPUs, but also AMD RDNA) have a user-selectable sub-group size. Since the kernel algorithms (e.g., shuffle-based reducti…
-
To handle a batch as fast as possible, the pipe needs to be filled appropriately. The balancing algorithm for it is complex and requires orchestration, and time spent on it, in addition to …
6r17 updated 3 months ago
-
I ran the code of this article: https://github.com/AI4Finance-Foundation/FinRL-Tutorials/blob/master/1-Introduction/Stock_NeurIPS2018_SB3.ipynb
I trained all algorithms in 3571 seconds using the CPU…
-
### TLDR:
Often when writing scientific algorithms we have to use some routines from cuSolver, like svd/eigh/qr. Those routines sometimes fail with unclear error messages that are not easy to unders…
-
Dear Author,
I am very interested in your work. I have successfully cloned and set up the project, but I noticed that test data is not included. To better understand and validate the performance of…
-
### Issue
I updated from a way older version (March perhaps), and noticed the rendering didn't seem to use GPU at all and was taking minutes instead of seconds. Working my way back through the commit…
Toora updated 3 months ago
-
### What is the issue?
I have noticed that when GPU VRAM gets near-full, but ollama has decided to load 2 models into VRAM, incoming requests to one model simply stall until the other model pops out …
-
### What happened + What you expected to happen
I'm just starting to learn to use Ray. I ran into an error, but I don't know how to solve it.
```
Failure # 1 (occurred at 2022-11-17_18-45-44)
Traceba…
```
-
**Describe the bug**
We recently saw a customer query where the number of dynamic partitions was 30,000 but the CPU had it at 20. I was not able to dig into the details of it, but it is clearly wrong…