Currently, if there is (full or partial) overlap between an input and an output Store in an operation, we handle this at the cuNumeric level, since the core will not do this check for us:
if store1.overlaps(store2):
    store1 = store1.copy()
task.add_input(store1)
task.add_output(store2)
However, if the following are all true:
- store1 and store2 are backed by the same RegionField (as a corollary, they overlap fully)
- store1 and store2 have the same store transformation (may be unnecessary?)
- the task aligns the two stores with task.add_alignment(store1, store2)
then we can avoid making the copy, because we can expect the core to coalesce the read requirement for store1 and the write requirement for store2 into the same read-write requirement.
We must be certain that the core will apply the coalescing transformation; otherwise we will get tasks with conflicting region requirements, which Legion will not catch in release mode.
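To make the proposed decision concrete, here is a rough, self-contained sketch of the check. It uses no Legate or cuNumeric code: `MockStore`, its fields, and `plan_overlap` are hypothetical names invented for illustration, and the interval-based `overlaps` is only a stand-in for whatever the real Store does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MockStore:
    """Hypothetical stand-in for a Store; not the real Legate class."""

    region_field: int      # id of the backing RegionField (assumed representation)
    transform: tuple = ()  # store transformations applied on top of it
    lo: int = 0            # element interval [lo, hi) covered in the RegionField
    hi: int = 10

    def overlaps(self, other):
        # Full or partial overlap: same backing storage, intersecting intervals.
        return (
            self.region_field == other.region_field
            and self.lo < other.hi
            and other.lo < self.hi
        )


def plan_overlap(inp, out, aligned):
    """Return 'independent', 'coalesce', or 'copy' for an input/output pair."""
    if not inp.overlaps(out):
        # No aliasing: separate read and write requirements are safe.
        return "independent"
    same_backing = (inp.region_field, inp.lo, inp.hi) == (
        out.region_field,
        out.lo,
        out.hi,
    )
    same_transform = inp.transform == out.transform
    if same_backing and same_transform and aligned:
        # All conditions hold: rely on the core to emit one read-write requirement.
        return "coalesce"
    # Conservative fallback, matching the current behavior: copy the input first.
    return "copy"
```

Anything short of the full set of conditions falls back to the copy, which is the safe default given that a missed conflict would go undetected in release mode.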