spcl / dace

DaCe - Data Centric Parallel Programming
http://dace.is/fast
BSD 3-Clause "New" or "Revised" License

Map over trivial tasklet #1778

Open ThrudPrimrose opened 3 days ago

ThrudPrimrose commented 3 days ago

This pass adds a trivial (single-iteration) map over free tasklets, i.e. tasklets whose scope is None.

I do this by going through the SDFG. If a node has no incoming edges and is not within a scope, I put it and the weakly connected component it belongs to in a trivial (single-iteration) map, provided that component contains no map entry nodes.

If I see a nested SDFG, I apply the pass to it recursively. This behavior could be controlled by an input parameter, for example recursive.
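To make the selection rule concrete, here is a minimal sketch in plain Python. It does not use the DaCe API: the dictionary-based graph layout, the node kinds, and the names weakly_connected_component and components_to_wrap are all hypothetical and exist only for illustration. It only reports which components would be wrapped and does not build the map itself.

```python
from collections import defaultdict


def weakly_connected_component(start, edges):
    """Return every node reachable from `start` when edge direction is ignored."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def components_to_wrap(graph, recursive=True):
    """List the components that would go into a single-iteration map.

    `graph` is a hypothetical stand-in for an SDFG state:
      graph["nodes"]: {name: {"kind": str, "scope": name or None, "nested": graph (optional)}}
      graph["edges"]: [(src, dst), ...]
    """
    nodes, edges = graph["nodes"], graph["edges"]
    has_incoming = {dst for _, dst in edges}
    result, visited = [], set()
    for name, info in nodes.items():
        # Start only from unvisited source nodes that are outside any scope.
        if name in visited or name in has_incoming or info["scope"] is not None:
            continue
        component = weakly_connected_component(name, edges)
        visited |= component
        # Leave components that already contain a map entry untouched.
        if any(nodes[n]["kind"] == "map_entry" for n in component):
            continue
        result.append(sorted(component))
    if recursive:
        # Descend into nested graphs, mirroring the recursive application.
        for info in nodes.values():
            if info.get("nested") is not None:
                result.extend(components_to_wrap(info["nested"], recursive=True))
    return result


# Illustrative input: one free tasklet writing to A, one mapped tasklet,
# and a nested graph (fed by the map output) that holds another free tasklet.
nested = {
    "nodes": {
        "t3": {"kind": "tasklet", "scope": None},
        "E": {"kind": "access", "scope": None},
    },
    "edges": [("t3", "E")],
}
example = {
    "nodes": {
        "t": {"kind": "tasklet", "scope": None},
        "A": {"kind": "access", "scope": None},
        "B": {"kind": "access", "scope": None},
        "m_in": {"kind": "map_entry", "scope": None},
        "t2": {"kind": "tasklet", "scope": "m_in"},
        "m_out": {"kind": "map_exit", "scope": "m_in"},
        "C": {"kind": "access", "scope": None},
        "n": {"kind": "nested_sdfg", "scope": None, "nested": nested},
    },
    "edges": [("t", "A"), ("B", "m_in"), ("m_in", "t2"),
              ("t2", "m_out"), ("m_out", "C"), ("C", "n")],
}
```

With this input, components_to_wrap(example) selects the component of t and A at the top level and the component of t3 and E inside the nested graph, while the component that already contains a map entry is skipped. Passing recursive=False restricts the result to the top level.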

I have added five test cases where the free tasklet-access-component needs to be put in a map and two cases where the pass should not change the SDFG.

tim0s commented 2 days ago

I see nothing wrong with the transformation itself; however, it seems to work around a specific bug or problem. So I would either generalize it (encapsulate an arbitrary sub-SDFG in a trivial map), which needs some discussion of how to specify the subgraph (maybe look at cutouts for that), or, if you want it as a quick hack to fix a specific problem, this would be a good time to introduce experimental transformations. As a suggestion: any test that fails on an experimental transformation should produce a warning, not an error.
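As a rough sketch of what "warning, not error" could mean for tests: the decorator below is hypothetical (the name experimental is made up here and is not an existing DaCe or pytest feature). It converts any exception raised by a test into a RuntimeWarning.

```python
import functools
import warnings


def experimental(test_func):
    """Hypothetical decorator: report a failing test as a warning instead of an error."""
    @functools.wraps(test_func)
    def wrapper(*args, **kwargs):
        try:
            return test_func(*args, **kwargs)
        except Exception as exc:  # includes AssertionError from failed asserts
            warnings.warn(
                f"experimental test {test_func.__name__} failed: {exc!r}",
                RuntimeWarning,
            )
            return None
    return wrapper


@experimental
def test_failing_experimental_pass():
    # Intentionally failing assertion, for illustration only.
    assert 1 + 1 == 3, "intentional failure for illustration"
```

Calling test_failing_experimental_pass() returns normally and emits a RuntimeWarning, so the test run would stay green while the failure remains visible in the warning summary.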

tbennun commented 1 day ago

My big question is: why? I would also echo Timo’s comment asking why this is not an sdfg utils API call like nest_sdfg_subgraph

ThrudPrimrose commented 1 day ago

> My big question is: why? I would also echo Timo’s comment asking why this is not an sdfg utils API call like nest_sdfg_subgraph

It is mainly to avoid bugs in other transformations (I have a separate bugfix PR for the to_gpu transformation) and to allow better control over where free tasklets execute (for example, when running a free tasklet on the GPU would avoid a later memcpy).

Personally, I do not mind whether it is an experimental pass or a utility function that the user can call for preprocessing. I thought it fit the concept of an optional pass best.