-
Hello, StarPU team!
We are developing an asynchronous inference engine based on StarPU. The main thread of our Python program creates new data handles, inserts tasks with these handles, and does `starpu…
Muxas updated
2 months ago
-
It seems that the updates introduced between 1.3 and 1.4 broke asynchronous partitioning. Basically, we have code like this:
```
starpu_data_partition_plan(...);
execute tasks on the parti…
```
-
Hi,
I'm trying to define a custom backend using StarPU (https://starpu.gitlabpages.inria.fr/), following the example of the loky implementation (which uses a Future as the return value of the submitted task), but I'…
-
### Steps to reproduce
I am trying to use the DARTS scheduler from the latest master branch (commit 4131e05d441f6aa3004632c61e982c63f2496cb9 on GitLab) and get the following error:
```
/trinity/home/…
```
Muxas updated
5 months ago
-
Hi!
I have just added `starpu_data_invalidate_submit` to my code. Of course, I made some mistakes along the way. StarPU reported some of them, signaling that some data was not initialized before being read. But …
Muxas updated
7 months ago
-
### Steps to reproduce
On Frontier, load rocm/6.0.0 and try to build StarPU 1.4.7.
### Obtained behavior
Linking fails with:
```
CCLD implicit_stencil
ld.lld: error: unable to find…
```
-
### The Issue
On a GPU node, when switching from StarPU version 1.3.11 to the 1.4 versions, we experience a strange performance drop. For our new software [NNTile](https://github.com/skolai/NNTile) it resu…
Muxas updated
8 months ago
-
Hi!
Tracking the amount of processed GFLOPs per computing device is a nice feature of StarPU profiling. However, tracking memory accesses is also very helpful for memory-bound tasks. This is totally …
Muxas updated
3 months ago
-
Hi! I am implementing a Python program that uses StarPU under the hood. The Python program simply calls Python/C++ wrappers, which pass execution to C++ routines, which then call StarPU task-related functio…
Muxas updated
7 months ago
-
Switching off the CUDA version of total_sum_accum in src/starpu/total_sum_accum.cc gives us a correct-looking loss value. The loss is calculated by total_sum_accum and, in the case of the CUDA version, outputs in…
Muxas updated
4 months ago