ThrudPrimrose opened 5 days ago
I can work around my issue by introducing `host_map` and `host_data` fields and setting them explicitly. This prevents the problem for my use case. Still, I think transients outside map scopes should not be mapped to the GPU.
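The rule I am arguing for can be stated as a small sketch. This is a schematic model only, not the real DaCe API: the function name and the schedule/storage strings are illustrative, standing in for DaCe's `ScheduleType`/`StorageType` enums.

```python
# Schematic model of the storage rule argued for above: a transient's storage
# should follow the schedule of its enclosing map, and a transient outside any
# map scope should stay on the host. All names here are illustrative.

def decide_transient_storage(enclosing_map_schedule):
    """enclosing_map_schedule: 'GPU_Device', 'Host' (a host map), or None
    (the transient sits outside any map scope)."""
    if enclosing_map_schedule == "GPU_Device":
        return "GPU_Global"
    # Host map, or no enclosing map at all: keep the transient on the host.
    return "CPU_Heap"

print(decide_transient_storage("GPU_Device"))  # GPU_Global
print(decide_transient_storage("Host"))        # CPU_Heap
print(decide_transient_storage(None))          # CPU_Heap
```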
Transient scalars are always mapped to GPU storage, even if they are in a Default map that is not mapped to the GPU.
I can prevent a map from being offloaded to the GPU by marking it as a "host_map"; it is a small feature I have added. Roughly at lines 315-225 in gpu_transform_sdfg.py, I can prevent a Default map from being mapped to GPU (GPU_Device) by using an additional variable on the map node. (Note: this change is not present in the latest commit.)
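The skip logic is roughly the following. This is a plain-Python sketch of my local change, not upstream DaCe code: `MapNode`, `should_offload_to_gpu`, and the `host_map` flag are all names from my modification, modeled here with a minimal stand-in class.

```python
class MapNode:
    """Minimal stand-in for a map node; only the fields this sketch needs."""
    def __init__(self, schedule="Default", host_map=False):
        self.schedule = schedule
        self.host_map = host_map  # custom flag from my local change, not upstream


def should_offload_to_gpu(map_node):
    # If the map is explicitly marked as a host map, never move it to
    # GPU_Device, even though its schedule is Default.
    if getattr(map_node, "host_map", False):
        return False
    return map_node.schedule == "Default"


print(should_offload_to_gpu(MapNode()))                # True
print(should_offload_to_gpu(MapNode(host_map=True)))   # False
```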
However, I can't get it to leave the transient Scalar in host storage, even when it sits inside a map that is a "host_map".
I will attach an SDFG to reproduce the behaviour (remove the .json extension; GitHub does not support the sdfgz format): cut_2.sdfgz.json
We have two chains with a (tasklet -> access node) pattern that initialize data containers (writing to access nodes). Whether or not I put these chains inside a trivial map, the transient "levmask" variable is always mapped to GPU storage.
Running:

```python
sdfg.apply_gpu_transformations(validate=True, validate_all=True, permissive=True,
                               sequential_innermaps=True, register_transients=False,
                               simplify=False)
```
on this SDFG results in invalid code, because levmask is mapped to GPU_Global storage while the tasklet stays on the host. To reproduce, download the SDFG and run this script. If I enclose the chain in a map, there is no problem. But if I enclose this tasklet with a map and then decide that this map should stay on the host, it still maps "levmask" to GPU_Global storage while the map schedule stays on the CPU. To reproduce, you can use this script:
This script puts a trivial map around each (tasklet -> access node) chain, and then marks any map that reads from or writes to a data container named "levmask" as a host map. The SDFG looks as follows: cut_2_preprocessed_2.sdfgz.json
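In outline, the tagging step of the preprocessing script does something like the following. This is a plain-Python sketch, not the script itself: the real script walks DaCe map entries and access nodes, whereas here map-to-data accesses are modeled as a simple dict, and `tag_host_maps` is an illustrative name.

```python
def tag_host_maps(map_accesses, host_data="levmask"):
    """map_accesses: dict mapping a map's name to the set of data-container
    names it reads from or writes to. Returns the names of maps that should
    be marked as host maps because they touch `host_data`."""
    return {name for name, containers in map_accesses.items()
            if host_data in containers}


accesses = {
    "trivial_map_0": {"levmask", "tmp0"},  # wraps a (tasklet -> levmask) chain
    "compute_map":   {"a", "b"},
}
print(tag_host_maps(accesses))  # {'trivial_map_0'}
```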